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I.  Introduction 

In  a  variety  of  applications  snch  as  found  in  radar  doppler  processing, 
adaptive  filtering,  speech  processing,  underwater  acoustics,  seismology, 
econometrics,  spectral  estimation  and  array  processing,  it  is  desired  to 
estimate  the  statistical  characteristics  of  a  wide-sense  stationary  time 
series.  More  often  than  not,  this  required  characterization  is  embodied  in 
the  time  series’  autocorrelation  lag  sequence  as  specified  by 

rx(n)  =  E  (x(n+m)x(m)}  (1,1) 

in  which  E  and  —  denote  the  operations  of  expectation  and  complex 
conjugation,  respectively.  From  this  definition,  the  well-known  property 
that  the  autocorrelation  lags  are  complex  conjugate  symmetric  (i.e.,  rx(-n)  = 
Tx(n))  is  readily  established.  We  will  automatically  assume  this  property 
whenever  negative  lag  autocorrelation  elements  (or  their  estimates)  are 
required. 

The  second  order  statistical  characterization  as  represented  by  the 
autocorrelation  sequence  may  be  given  an  'equivalent*  frequency  domain 
interpretation.  Namely,  upon  taking  the  Fourier  transform  of  the 
autocorrelation  sequence,  that  is 

00 

Sx(ejw)  =  ^  rx(n)  e-j*M»  (1.2) 

n=— 00 


we  obtain  the  associated  power  spectral  density  function  Sx(ej“)  in  which  the 
normalized  frequency  variable  o>  takes  on  values  in  [-n,n].  This  function 
possesses  a  number  of  salient  properties  among  which  are  that  it  is  a 
positive  semidefinite,  symmetric  (if  the  time  series  is  real  valued),  and, 
periodic  function  of  u.  This  function  is  seen  to  have  a  Fourier  series 
interpretation  in  which  the  autocorrelation  lags  play  the  role  of  the  Fourier 
coefficients.  It  therefore  follows  that  these  coefficients  may  be  determined 
from  the  power  spectral  density  function  through  the  Fourier  series 
coefficient  integral  expression 

n 

rx(n)  =  L  Sx(eJ“)  ejwn  dm  (1.3) 

2n  J 

-n 
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Relationships  (1.2)  and  (1.3)  fora  a  Fourier  transform  pair  ao  that  knowledge 
of  the  autocorrelation  sequence  is  equivalent  to  knowledge  of  the  power 
spectral  density  function  and  vice  versa.  Ve  belabor  this  point  in  order  to 
establish  the  viewpoint  that  spectral  estimation  and  autocorrelation  lag 
estimation  are  conceptually  equivalent. 

In  the  classical  spectral  estimation  problem,  it  is  desired  to  effect  an 
estimate  of  the  underlying  power  spectral  density  function  with  this  estimate 
being  based  on  only  a  finite  set  of  time  series  observations.  Typically, 
these  observations  will  be  composed  of  a  set  of  contiguous  data  measurements 
taken  at  equispaced  time  intervals  T  as  represented  by 

x(l).  x(2) ,  .  .  .,  x(N)  (1.4) 
where  N  will  be  referred  to  as  the  data  length  and  we  have  chosen  to  suppress 
the  sampling  period  T.  It  is  apparent  that  unless  some  constraints  are 
imposed  on  the  basic  nature  of  the  power  spectral  density  function,  there 
exists  a  fundamental  incompatibility  in  seeking  an  estimate  of  the  infinite 
parameter  function  (1.2)  (i.e.,  the  infinite  set  of  autocorrelation  lags 
rz(n))  based  on  the  finite  set  of  observations  (1.4).  Investigators  have 
often  resolved  this  dilemma  by  postulating  a  finite  parameter  model  for  the 
power  spectral  density  function.  The  time  series  observations  (1.4)  sre  then 
used  to  fix  the  parameters  of  this  parametric  model  using  an  appropriate 
estimation  procedure. 

Without  doubt,  the  most  widely  used  and  studied  of  finite  parametric 
models  are  the  so-called  rational  models.  When  employing  a  rational  model, 
we  are  seeking  to  approximate  the  generally  infinite  series  expansion  (1.2) 
by  a  magnitude  squared  ratio  of  polynomials  in  the  variable  e~jb>,  that  is 


S(eJ«)  - 


b0  +  bje~jw  +  ...  +  bqe_j^“ 


1  ♦  *1  e~Jw  +  ...  +  ape 


-JPw 


(1.5) 


The  finite  number  of  parameters  in  this  model  then  provides  the  mechanism  for 
circumventing  the  aforementioned  parameter  mismatch  dilemma.  Namely,  if  the 
data  length  parameter  N  adequately  exceeds  this  rational  function's  number  of 
parameters  (i.e.,  p+q+1),  then  it  is  feasible  to  utilize  the  given  time 
series  observations  (1.4)  to  estimate  values  for  these  parsmeters.  A  few 
words  are  now  appropriate  concerning  the  adequacy  of 


rational  models  in  representing  power  spectral  density  functions.  It  is  wall 
known  that  if  a  power  spectral  density  function  is  continuous  in  the  variable 
<a,  then  it  may  be  approximated  arbitrarily  closely  by  a  rational  function  of 
form  (1.5)  if  the  order  parameters  p  and  q  are  selected  suitably  large  [41], 
Comforted  by  this  knowledge,  rational  functions  have  become  a  standard  tool 
of  spectral  estimation  theoreticians.  As  an  interesting  side  note,  it  is 
ironical  that  the  origin  of  spectral  estimation  was  in  the  use  of  rational 
models  for  characterizing  time  series  composed  of  sinusoids  in  white  noise. 
Members  of  this  class  of  time  series  possess  discontinuous  power  spectral 
density  functions  and  are  therefore  presumably  not  representable  by  a 
rational  model.  As  we  will  see  in  Section  III,  however,  it  is  possible  to 
suitably  adapt  a  specific  rational  model  so  as  to  satisfactorily  characterize 
this  class  of  time  series. 

This  paper  is  primarily  concerned  with  developing  a  modeling  method 
which  utilizes  an  overdo term ined  set  of  statistical  equations  for  estimating 
a  rational  model's  parameters.  Using  this  approach,  it  is  found  that  the 
resultant  modeling  performance  is  generally  better  than  that  achieved  by 
other  popularly  used  parametric  methods.  Although  the  approach  here  taken 
reflects  heavily  upon  the  author's  previous  works  [15] — [ 22] ,  much  of  this 
paper  will  be  concerned  with  formulating  many  contemporary  spectral 
estimation  methods  in  a  common  autocorrelation  representation  setting.  It 
must  be  emphasized  that  our  main  objective  is  not  that  of  giving  an 
encylopedic  coverage  of  the  many  available  rational  spectral  estimation 
techniques.  This  paper  in  conjunction  with  the  excellent  recent  publications 
[23] ,[31] ,[37] ,  however,  provides  a  reasonable  complete  coverage  of 
parametric  methods. 

In  the  remainder  of  this  section,  we  shall  consider  two  special  classes 
of  rational  functions  and  give  a  brief  historical  perspective  on  their  usage 
in  spectral  estimation  theory.  These  two  classes  are  commonly  referred  to  as 
the  moving  average  (MA)  and  the  autoregressive  (AR)  spectral  models.  A 
moving  average  model  is  defined  to  be  a  rational  function  (1.5)  in  which  all 
the  afc  parameters  are  zero  (i.e.,  it  has  only  numerator  dynamics)  while  an 
autoregressive  model  is  one  for  which  all  the  b^  parameters  are  zero  except 
for  b0  (i.e.,  it  has  only  denominator  dynamics).  By-in-large,  these  two 
classes  of  rational  functions  have  formed  the  basic  modeling  tools  in 
contemporary  spectral  estimation  theory. 


MA  Model 


Fourier  analysis  has  played  a  primary  role  in  much  of  the  earlier  as 
well  as  more  recent  efforts  at  spectrally  characterizing  experimentally 
collected  data.  As  an  example,  Schuster  applied  the  periodogram  method  for 
detecting  hidden  periodicities  in  sun  spot  activity  data  at  the  turn  of  the 
century  [58]  .  In  a  more  recent  classical  work,  Blackman  and  Tukey  presented 
a  generalized  procedure  for  effecting  spectral  estimates  [8] .  This  involved 
the  two  step  procedure  of  (i)  determining  autocorrelation  lag  estimates  rx(n) 
using  the  provided  data,  and,  (ii)  taking  the  Fourier  transform  of  these 
estimates.^  The  power  spectral  density  estimate  which  arose  when  taking 
this  approach  then  took  the  form 

4 

Sx(ej«»)  =  ^  w(n)  rx(n)  e~i^  (1.6) 

n=-q 


where  w(n)  is  a  symmetric  data  window  that  is  chosen  to  achieve  various 
desirable  effects  such  as  side  lobe  reduction.  This  window  is  often  selected 
to  be  rectangular  in  which  case  w(n}  *  1  although  other  choices  may  be  more 
desirable  for  a  given  application.  A  description  of  some  of  the  more  popular 
choices  for  the  data  window  may  be  found  in  numerous  texts  (e.g.,  see  refs. 
[33], [50], [57]). 

In  the  Blackman- Tukey  estimate  (1.6),  it  is  seen  that  only  a  finite 
number  of  summand  terms  (i.e.,  2q+l)  are  involved  in  the  spectral  estimate. 
This  is  a  direct  consequence  of  the  fact  that  only  a  finite  set  of 
autocorrelation  lag  estimates  are  obtainable  from  the  observation  set  (1.4) 
if  standard  lag  estimation  methods  are  employed.  Due  to  this  finite  sum 
structure,  we  will  now  show  that  the  Blackman- Tukey  estimation  method  is  a 
special  case  of  the  more  general  rational  MA  spectral  model.  In  particular, 
a  spectral  model  is  said  to  be  a  moving  average  model  of  order  q  (i.e., 
MA(q))  if  it  may  be  put  into  the  form 


1  We  shall  hereafter  use  the  caret  symbol  (*)  to  denote  a  statistical 
estimate. 


Sjj^(oj“)  =  I  b0  +  bj  e-j“  +  . 


(1.7) 


bq  e' 


■jqw  l" 


=  I  Bq(ej“) 


The  q+1  parameters  b0,  bj,  ...»  bq  which  identify  this  MA(q)  model  are  seen 
to  form  a  qtb  order  polynomial  Bq(eJw)  in  the  variable  e“j“.  A  moving 
average  model  is  then  seen  to  be  a  special  case  of  the  more  general  rational 
model  (1.5)  in  which  the  denominator  polynomial  has  been  set  equal  to  the 
constant  one. 

If  the  polynomial  Bq(eJw)  constituting  the  moving  average  model  (1.7)  is 
factored,  it  is  possible  to  provide  additional  insight  into  a  MA  model’s 
properties.  This  factorization  is  seen  to  give  rise  to  the  equivalent 
representation 

,  q 

SMA(ej“)  =  lbor  TT  (1  “  zke_ju)(l  "  Zkejw>  (1.8) 

k=l 


in  which  the  z^  are  the  roots  of  the  polynomial  Bq(ejw).  The  zeroes  of  a  HA 
spectral  model  are  seen  to  occur  in  reciprocal  pairs.  Due  to  the  basic 
nature  of  this  factorization,  moving  average  models  are  therefore  also 
commonly  referred  to  as  all-zero  models.  If  any  of  the  roots  z^  are  close  to 
the  unit  circle  (i.e.,  zksseJWk),  it  is  clear  that  Sj(A(ejw)  will  contain 
sharply  defined  notches  at  frequencies  in  a  neighborhood  associated  with 
these  roots  (i.e.,  «>  =  w^) .  It  is  therefore  apparent  that  MA  models  will  be 
particularly  effective  when  approximating  spectra  that  contain  sharply 
defined  notches  (zero  like  behavior),  but,  do  not  contain  sharply  defined 
peaks.  Whenever  a  spectrum  contains  sharply  defined  peaks,  it  is  possible  to 
simulate  their  effect  at  the  cost  of  many  additional  zeroes  (i.e.,  a  high  HA 
order)  for  an  adequate  representation.  With  this  in  mind,  MA  models  should 
be  normally  avoided  whenever  a  peaky  type  behavior  in  the  underlying  spectrum 
is  suspected  (as  may  be  made  evident  from  a  preliminary  Blackman-  Tukey 
estimate) . 

To  establish  the  fact  that  the  Blackman- Tukey  approach  to  spectral 
estimation  is  of  a  moving  average  structure,  it  is  possible  to  give  yet 
another  equivalent  representation  to  the  MA(q)  expression  (1.7).  This  will 


6 


entail  explicitly  carrying  out  the  indicated  polynomial  product 
Bq(eJw)  Bq(eitt)  thereby  giving 

4 

Sj|x(ei“)  =  Y  °n  e_^wn  (1.9) 

n=-q 

in  which  the  complex  conjugate  symmetric  cn  parameters  are  related  to  the 
original  bn  parameters  according  to 

4 

cn  =  5  bk  bk-n  _4i&i4  (1.10) 

k=0 

Upon  setting  the  cn  equal  to  w(n)rx(n),  it  is  apparent  .at  the 
Blackman-Tukey  estimate  (1.6)  is  a  special  form  MA(q)  model.  This  fact  is 
usually  overlooked  by  investigators  who  have  considered  the  Blackman- Tukey 
method  as  well  as  the  periodogram  as  nonparametric  spectral  estimators.  When 
viewed  from  the  approach  here  taken,  however,  each  of  these  procedures  is 
recognized  as  being  a  realization  of  a  NA  parametric  model. 

AR  Model 

When  we  compare  the  MA(q)  spectral  model  expression  (1.9)  with  the 
theoretical  power  spectral  density  function  (1.2)  which  is  being  estimated, 
it  is  apparent  that  a  serious  modeling  mismatch  can  arise  whenever  the 
underlying  autocorrelation  lags  are  such  that  the  rx(n)  are  not  approximately 
equal  to  zero  for  n  >  q.  For  example,  this  undesirable  condition  arises  when 
the  time  series  under  study  is  composed  of  sinusoids  in  white  noise. 
Conversely,  this  condition  does  not  arise  for  broad  band  signals.  The 
sinusoid  example  is  mentioned  since  it  forms  one  of  the  more  interesting 
special  case  time  series  to  which  spectral  estimation  techniques  are  applied. 
A  special  treatment  of  the  sinusoids  in  white  noise  case  will  be  given  in 
Section  III. 

In  recognition  of  this  potential  shortcoming  of  MA  models,  investigators 
have  examined  alternate  rational  spectral  models  which  do  not  invoke  the 
unnecessarily  harsh  requirement  of  a  truncated  autocorrelation  lag  behavior. 


Undoubtedly,  the  most  widely  need  of  snch  models  is  the  AS  model.  Nsaely,  a 
spectral  model  is  said  to  be  an  autoregressive  model  of  order  p  (i.e.,  AR(p)) 
if  it  may  be  put  into  the  fora 

»>0 _  I2 

-jp»!  (l.ii) 


sAR(ej“) 


1  +  ale-j«*  +  a2e“j2“  +  ...  +  ape 
lb„|2 


lAp(ei“) I 

This  AR(p)  model  has  a  functional  behavior  which  is  completely  characterized 
by  its  p+1  parameters  b0,  aj ,  a2»  ....  ap.  The  characteristic  pth  order 
polynomial  ApCeJ®)  is  seen  to  influence  the  frequency  behavior  of  the 
estimate  while  the  parameter  b0  controls  the  level. 

As  in  the  MA  model  case,  valuable  insight  into  the  capabilities  of  AS 
modeling  is  provided  upon  factoring  the  polynomial  A^ei").  This  is  found  to 
result  in  the  equivalent  representation 


SARCej") 


lb0l 


X 


(1.12) 


(l-pke-j“)(l-pkejw) 


where  the  pk  are  the  roots  of  Ap(ej<,,).  The  poles  of  this  AR  spectral  model 
are  seen  to  occur  in  reciprocal  pairs.  For  reasons  which  are  self  evident, 
the  AR(p)  spectral  model  is  also  commonly  referred  to  as  an  all-pole  model. 
As  such,  it  is  particularly  appropriate  for  modeling  spectra  which  contain 
sharply  defined  peaks  (pole  like  behavior),  but,  do  not  contain  sharply 
defined  notches.  If  a  spectrum  does  possess  notches,  however,  it  is  possible 
to  simulate  their  effect  at  the  cost  of  many  additional  poles  (i.e.,  a  high 
AR  order).  In  terms  of  parameter  parsimony.  It  is  therefore  prudent  to  avoid 
AR  models  whenever  notches  in  the  underlying  spectrum  are  suspected  (this  may 
be  made  evident  from  a  preliminary  Blackman-Tukey  estimate). 

Autoregressive  models  were  used  by  Yule  [66]  and  Walker  [63]  in 
forecasting  trends  of  economically  based  time  series.  These  models  were  then 
employed  by  Burg  [13]  in  1967  and  Parzen  [S3]  in  1968  to  achieve  spectral 
estimates  which  did  not  possess  the  aforementioned  deficiencies  of  the  MA 
model.  The  Burg  method  is  of  particular  interest  since  it  offered  s  new 


p 


insight  into  spectral  modeling  and  introduced  a  number  of  concepts  that  are 
now  standard  tools  of  spectral  estimation.  This  includes  an  efficient 
lattice  structured  implementation  of  the  Burg  method  which  has  sinoe  been 
examined  and  advanced  by  many  investigators  (e.g.,  see  ref.  [44]).  It  is  not 
an  exaggeration  to  say  that  Burg's  method  gave  rise  to  a  literal  explosion  in 
research  activity  directed  towards  evolving  improved  rational  modeling 
methods . 


ABBA  Models 

In  many  applications,  the  underlying  power  spectral  density  function 
will  contain  both  notch  and  peak  like  behavior.  As  such,  neither  the  MA  nor 
the  AR  model  is  the  most  appropriate  model  representation  from  a  parameter 
parsimony  view  point.  The  more  general  rational  model  (l.S),  however,  is 
capable  of  efficiently  representing  such  behavior.  This  most  general 
rational  model  is  commonly  referred  to  as  an  autoregressive-moving  average 
model  of  order  (p,q)  (i.e.,  ARMA  (p.q))  with  its  frequency  characterization 
being  given  by 

1,2 

b„  +  bie~jw  +  ...  +  bqe“jQwJ 

1  +  aie-j«  +  ...  +  ape-jP»  I 
|Bq(eJ«)|2 

lAp(eJ w) I  (1.13) 

An  ARMA  model  is  seen  to  have  a  frequency  characterization  which  is  the 
composite  of  a  MA  and  an  AR  model.  To  further  reinforce  this  interpretation, 
we  have  the  following  equivalent  representation  upon  factoring  the 
polynomials  Ap(ejw)  and  Bq(eJw)  which  characterize  its  frequency  behavior 

q 

TT  (1  -  zk  e-J“)(l-IkeJ») 

sARMA(e^w>  =  lb0l2  _ 

P 

TT  (1  -  Pk  e-J“)(l  -  pk  eJ»)  (1.14) 

k=l 
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An  ARMA  model  it  teen  to  pottett  q  zeroes  and  p  polet,  end,  at  such  it  it 
generally  a  such  more  effective  model  than  are  itt  nor e  apecialized  MA  (all 
zero)  and  AR  (all  pole)  model  counterpart*.  Thete  polet  and  zeroea  are  aaen 
to  occur  in  reciprocal  pairt. 

Although  ARMA  model  t  are  the  moat  preferable  choice  for  moat 
application!,  many  practitioner!  have  opted  to  utilize  either  MA  or  AR 
modelt.  There  is  an  increating  avarenett,  however,  of  the  general 
superiority  of  ARMA  modeling.  This  hat  given  rite  to  a  renewed  effort  to 
generate  computationally  efficient  ARMA  modeling  algorithms.  A  particularly 
effective  approach  to  ARMA  modeling  will  be  presented  in  this  paper. 
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II.  Rational  Modeling  -  Exact  Autocorrelation  Knowledge 

In  this  section,  the  theoretical  antocorrelation  characteristics  of  HA, 
AS  and  ARMA  random  processes  are  examined  separately.  This  characterization 
will  in  torn  enable  ns  to  intelligently  select  the  most  appropriate  rational 
model  which  best  represents  a  given  set  of  exact  autocorrelation  lags 

*  Tj(l) ,  .  .  .,  Tj(s)  (2.1) 

Moreover,  a  systematic  procedure  for  identifying  the  selected  model's 
parameters  from  these  given  autocorrelation  lag  values  is  also  developed. 
Although  the  assumption  here  made  of  exact  autocorrelation  information  is 
highly  idealistic  and  almost  never  met  in  applications,  the  insight  thereby 
provided  is  helpful  when  considering  the  more  practical  problem  of  generating 
rational  model  estimates  from  raw  time  series  observations. 

To  begin  this  analysis,  it  will  be  hereafter  assumed  that  the  time 
series  under  examination  is  generated  (or  can  be  adequately  modeled)  as  the 
response  associated  with  the  linear  operator 

P  q 

x(n)  +  Y  *k  *(n-k)  =  \  bk  e(n-k)  (2.2) 

k=l  k=0 

in  which  the  excitation  time  series  (s(n))  is  taken  to  be  a  sequence  of  zero 
mean,  unit  variance,  uncorrelated  random  variables  (i.e.,  normalized  white 
noise)  that  is  taken  to  be  unobservable.  This  excitation-response  behavior 
is  depicted  in  Figure  2.1.  Using  standard  techniques,  it  is  readily  shown 
that  the  power  spectral  density  function  associated  with  the  response  time 
series  is  given  by  the  ARMA(p,q)  rational  form 

12 
b0  +  hi  e" Jw  +  ...  +  bq  e-jq«  I 

1  +  «1  e-j*>  +  ...  +  Bp  e-jP«  I 


Thus,  there  is  an  equivalency  between  an  assumed  ARMA  (p,q)  spectral  model, 
and,  the  response  of  the  recursive  linear  operator  (2.2)  to  white  noise.  In 
this  section,  the  required  rational  modeling  will  be  developed  through  use  of 
the  time  series  description  (2.2)  and  its  associated  autocorrelation 
characterization.  It  is  interesting  to  note  that  most  available  rational 
spectral  estimation  techniques  are  based  upon  a  time  domain  characterization. 
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Figure  2.1.  Model  of  rational  tine  series. 

The  mechanism  for  effecting  the  required  rational  modeling  are  the 
so-called  Yule-Walker  equations  which  govern  linear  relationship  (2.2). 
Namely,  upon  multiplying  both  sides  of  this  relationship  by  x(n-m)  and  then 
taking  expected  values,  it  is  found  that  the  Yule-Walker  equations 

P  <1 

^  akrx(n-k)  =  ^  bi  h(i-n)  (2.3) 

k=0  i=0 

arise  where  aQ  =  1.  The  entity  h(n)  herein  used  corresponds  to  the 

unit-impulse  (i.e.,  Kronecker  delta)  response  of  linear  operator  (2.2).  This 
unit-impulse  response  may  also  be  interpreted  as  being  the  inverse  Fourier 
transform  of  the  linear  operator's  transfer  function  Bq(e ju>)/Ap(eju>) .  In 
what  is  to  follow,  it  will  be  assumed  that  this  linear  operator  is  causal 
thereby  implying  that  h(n)  =  0  for  n  negative.  Although  this  assumption  is 
not  essential  in  the  analysis  which  follows,  it  is  here  imposed  in 
recognition  of  the  fact  that  most  applications  are  inherently  involved  with 
causal  operations.  Adaption  to  the  case  where  noncausal  operations  are  more 
appropriate  is  straightforward  and  will  not  be  given. 

The  Yule-Walker  equations  (2.3)  take  on  a  particularly  simple  form  when 
the  linear  operator  (2.2)  which  they  describe  is  constrained  to  be  a  MA  or  an 
AR  linear  operator.  To  delineate  this  fact,  we  shall  now  examine  separately 
the  basic  characteristics  of  the  Yule-Walker  equations  when  the  underlying 
linear  model  is  taken  to  be  MA,  AR,  and  ARMA. 
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MA  Time  Series 

The  time  series  {x(n}}  is  said  to  be  a  moving  average  random  process  if 
it  is  generated  according  to  the  linear  nonrecnrsive  relationship 


q 

x(n)  =  ^  bk  e(n-k)  (2.4) 

k=0 


where  (s(n)}  is  the  af orenentioned  normalized  white  noise  excitation  process. 
According  to  the  general  Yule-Walker  equations  (2.3),  the  response's 
autocorrelation  sequence  is  therefore  specified  by 


rx(n)  « 


-q  i  a  i  q 


otherwise 


(2.5) 


where  use  of  the  facts  that  ak  =  0,  and,  h(n)  =  bn  for  0  <  n  £  q  have  been 
incorporated.  Thus,  the  autocorrelation  sequence  associated  with  a  moving 
average  process  is  seen  to  be  of  finite  length  (i.e.,  2q+l)  with  the  length 
identifying  the  order  of  the  MA(q)  process. 

We  shall  now  consider  the  problem  of  identifying  the  MA  parameters  bk 
which  correspond  to  a  given  2q+l  length  autocorrelation  sequence  rx(n)  for 
~qlni.q>  This  identification  will  be  made  by  examining  the  spectral  density 
function  associated  with  the  autocorrelation  sequence.  In  particular,  upon 
taking  the  z-transform  (in  lieu  of  the  Fourier  transform)  of  the  given  2q+l 
length  autocorrelation  sequence,  we  have  upon  using  relationship  (2.5) 


Sx(z)  =  ^  rx(n)z~n 

n=-q 

q  q 

=  ^  ^  bk  bk_n  z-“ 

n=-q  k=0 

q  q 

=  5  bk  *"k  5  bm  (2-6) 

k-0  .  m-0 
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Since  the  finite  power  series  Sx(z)  has  complex  conjugate  coefficients  (i.e.. 


rx(-n)  =  rx(n)),  it  follows  that  the  zeroes  of  this  power  series  must  occur 
in  reciprocal  pairs.  With  this  in  Bind,  it  is  therefore  always  possible  to 
factor  the  power  spectral  density  function  as 

q 

Sx(z)  =  a2  T7  (l-Ziz-iHl-Zkz)  (2.7) 

k=l 

where  a  is  a  real  scalar.  Upon  coaparing  expressions  (2.6)  and  (2.7),  it  is 
apparent  that 

q  q 

^  bj.  z-k  =  a  "|"J"  (l-z^z-*)  (2.8) 

k=0  k=l 

Thus,  the  required  bk  parameter  identification  is  achieved  by  carrying  out 
the  right  side  multiplications  in  expression  (2.8)  and  then  equating 
coefficients  of  equal  powers  of  z~kt  The  most  critical  step  of  this 
identification  procedure  is  the  factorization  of  the  known  power  series  Sx(z) 
as  given  in  equation  (2.7) . 

One  point  of  caution  should  be  raised  in  following  this  approach.  It 
arises  due  to  the  fact  that  although  the  factorization  of  Sx(z)  into  its  2q 
first  order  product  terms  is  unique,  the  decomposition  (2.7)  is  certainly 
not.  This  is  a  direct  consequence  of  the  appearance  of  the  roots  of  Sx(x)  in 
reciprocal  pairs.  Thus,  the  term  (1-Xiz-*)  may  be  replaced  by  (l-zi-lz-1)  in 
expression  (2.8)  without  destroying  the  required  structure  (2.6) .  This 
replacement,  however,  will  in  general  lead  to  a  different  set  of  b^ 
parameters.  Since  there  are  typically  q  different  first  order  reciprocal 
pairs  in  the  factorization  (2.7),  it  then  follows  that  there  are  24  different 
bn  parameter  sets  which  are  compatible  with  the  autocorrelation  identity 
(2.5).  The  one  normally  chosen  corresponds  to  the  so-called  minimum  delay 
selection  in  which  the  z^  roots  used  in  expression  (2.8)  are  selected  so  that 
they  all  have  magnitudes  less  than  or  equal  to  one. 


AR  Time  Serial 

The  time  series  (x(n)}  is  said  to  be  an  autoregressive  (AR)  process  of 
order  p  if  it  is  generated  according  to  the  recursive  relationship 


x(n)  +  ^  ak  x(n-k)  =  b0e(n) 
k=l 


(2.9) 


where  (s(n))  is  the  aforementioned  normalized  white  noise  process.  The 
Yule-Walker  equations  (2.3)  indicate  that  the  AR(p)  autocorrelation  elementa 
are  related  by 


rx(n)  +  ^  *k  rx(n-k)  = 

k=l 


Ibol2 

0 


bpO 

nil 


(2.10) 


where  use  of  the  facts  that  h(0)  =  b0  and  h(n)  =  0  for  n  <  0  have  been  made. 

In  order  to  effect  a  direct  procedure  for  identifying  the  AR(p)  model's 
p+1  parameters  aj,  S2»  ....  ap,  b0  which  best  represent  the  set  of 
autocorrelation  lag  values  (2.1),  one  may  evaluate  the  first  p+1  of  the 
governing  Yule-Walker  equations.  This  evaluation  when  put  into  a  matrix 
format  takes  the  form 


or  more  compactly  as 

R  a  -  |b0|2  «!  (2.11b) 

In  this  expression,  R  is  the  (p+l)x(p+l)  AR  autocorrelation  matrix  whose 
elements  are  given  by 

RU.j)  =  rx(i-j)  1  i  i  i  P+1  (2.12) 

1  i  j  i  P+1 

a  is  the  (p+l)xl  autoregressive  parameter  vector  with  first  component  equal 
to  one,  that  is 

I  “  (1»®1,  S2 ,  ...,  Sp]  (2.13) 

and  ej  is  the  (p+l)xl  standard  basis  vector  whose  elements  are  all  zero 
except  for  its  first  which  is  one.  The  required  parameter  identification  is 
then  obtained  upon  solving  this  system  of  p+1  linear  equations  in  the  p+1 
unknowns.  Conceptually,  this  solution  may  be  effected  by  performing  the 
following  computation 

a  =  |bc|2  R'l  ej  (2.14) 

in  which  the  normalizing  coefficient  b0  is  selected  so  that  the  first 
component  of  a  is  one  as  required  in  expression  (2.13).  In  this  solution 
procedure,  we  are  tacitly  assuming  the  invertibil i ty  of  the  autocorrelation 
matrix  R.  If  matrix  R  is  singular,  however,  this  almost  always  implies  that 
the  underlying  time  series  is  an  autoregressive  process  of  order  less  than  p. 
In  this  case,  it  will  be  necessary  to  decrease  the  order  until  R  becomes 
invertible. 

Upon  examination  of  expression  (2.11),  it  is  seen  that  the  resultant 
AR(p)  model  parameters  are  totally  dependent  on  the  first  p+1  given 
autocorrelation  lags  rx(0) ,  rx(l) ,  ....  rx(p).  Although  the  associated  model 
will  have  an  autocorrelation  behavior  which  perfectly  matches  these  first  p+1 
lags,  it  may  provide  a  very  poor  representation  for  the  remaining  given 
autocorrelation  lags  rx(p+l),  rx(p+2),  ...,  rx(s)  (which  were  not  used  in  the 
parameter  identification).  In  order  to  provide  a  represention  for  these 
higher  lags  by  the  procedure  here  taken,  it  may  be  necessary  to  increase  the 
AR  model  order  to  s  (i.e.,  p=s).  In  many  applications,  however,  the 
underlying  goal  will  be  that  of  providing  an  AR  model  of  relatively  low  order 
(i.e.,  p<<s)  which  will  adequately  represent  the  entire  set  of 
autocorrelation  lags.  A  procedure  for  achieving  this  objective  will  be 
shortly  given.  Before  considering  this  most  relevant  objective,  let  us  first 
outline  an  elegant  method  for  solving  the  system  of  linear  equations  (2.11). 


LEV  IN  SON -DURBIN  ALGORITHM:  Although  the  solution  procedure  as  embodied 
in  expression  (2.14)  will  result  in  the  desired  parameter  identification,  the 
evaluation  of  R”*  will  entail  on  the  order  of  p3  multiplication  and  addition 
calculations  (i.e.,  o(p3))  if  standard  procedures  such  as  Gaussian 

elimination  are  used.  Fortunately,  it  is  possible  to  take  advantage  of  the 
fact  that  the  autocorrelation  matrix  R  is  both  complex  conjugate  symmetric 
(i.e.,  R(i,j)  =  R( j , i) )  and  Toeplitz  (i.e.,  R(f,j)  =  R(i+l,j+l))  so  as  to 
effect  a  computationally  efficient  solution  procedure.  This  method  was 
developed  by  Levinson  and  is  commonly  referred  to  as  the  Levinson-Durbin 
algorithm  [24], [43].  In  this  approach,  one  solves  the  linear  system  of 
equations  (2.11)  as  the  AR  order  parameter  p  is  sequenced  through  the  values 
1,  2,  3,  ....  pm  where  pn  designates  some  as  yet  unknown  maximum  AR  order. 
In  this  sequencing  scheme,  Levinson  showed  that  the  parameters  for  the  kth 
order  AR  model  solution  which  are  designated  by 

aj/k^,  ...»  *k^^»  b0(k)  (2.15) 

are  related  to  the  (k-l)th  order  AR  model  solution  as  outlined  in  Table 
2.1.  A  brief  description  of  this  systematic  algorithm  will  now  be 
given. 


Step  1 

«1(1)  =  ~rx(l)/rx(0) 

lb0(1) I2  «  [1  -  ia!(1) I2]  rx{0) 

(2.16a) 

(2.16b) 

Step  2 

For  k  =  2  #  3  ,  4#  •  •  • 

k-1 

ak<k>  =  -  rx(k)  +  ^  am<k_1)  rx(k-m)  /lb0(k_1)|2 
idfI 

(2.17a) 

«i(k)  =  al(k_1)  +  ak(k)  aikI1>  1  i  i  i  k-1 

(2.17b) 

|b0(kV  =  [1  -  l.k<k)|2]lb0<k_1)|2 

(2.17c) 

Table  2.1.  Levinson-Durbin  Algorithm  for 

Recursively  Solving  Expression  (2.11) 
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If  one  were  to  solve  the  linesr  system  of  equations  (2.11)  for  the  order 
choice  p  =  1 ,  it  would  be  found  that  the  required  first  order  AS  parameters 
(with  superscript  (1)  appended)  are  given  in  step  1  of  Table  2.1.  Upon 
setting  p  -  2  in  expression  (2.11),  a  moderate  amount  of  algebraic 
manipulation  will  reveal  the  validity  of  the  solution  as  given  in  Step  2  of 
Table  2.1  with  k  =  2  (with  superscript  (2)  appended).  Levinson  proved  that 
in  following  the  systematic  procedure  of  Table  2.1,  the  solutions  to  the 
Yule-Walker  equation  (2.11)  for  order  selections 

p  -  1,  2,  3,  ...  are  sequentially  obtained.  Moreover,  the  number  of 

multiplication  (and  addition)  computations  required  in  generating  the  k*b  AS 
order  parameters  from  the  k-lst  AR  order  parameters  (i.e..  Step  2)  is  seen  to 
be  k.  Thus,  the  computational  complexity  of  the  Levinson-Durbin  algorithm 
for  generating  a  p*h  order  AS  model  (and  all  lower  order  models  as  a 
byproduct)  is  found  to  be  o(p^)  .  This  is  a  considerable  savings  over  the 
computational  complexity  of  o(p3)  required  in  solving  expression  (2.11)  using 
standard  techniques. 

The  Levinson-Durbin  algorithm  provides  not  only  a  computationally 
efficient  method  for  generating  the  AR  parameters,  but,  it  also  yields  an 
effective  AR  model  order  determination  procedure.  Specifically,  let  it  be 
assumed  that  the  autocorrelation  lags  used  in  expression  (2.11)  correspond  to 
an  AR(p)  process.  If  the  Levinson-Durbin  algorithm  were  applied  to  this 
autocorrelation  lag  information,  by  the  very  nature  of  this  procedure,  the  AR 
process  parameters  would  be  perfectly  identified  at  the  ptk  iteration  (i.e., 

(p)  .  (p) .2  i  2 1 

sv  =avk  =  l,  2,  ....  p  and  |bQ  *  \  =  |bn  I).  Moreover,  if  this 

(k) 

recursion  were  continued  beyond  p,  it  would  be  found  that  a^  =  for  1  <. 

i  i  P,  a^k^  =  0  for  p+1  i  i  i  k,  and,  lb0^kM  =  |b0|  .  This  is  a  direct 

consequence  of  the  fact  that  ap+j(P+D  must  be  zero  as  is  evident  from 

expression  (2.17a).  From  these  observations,  it  is  therefore  apparent  that 

2 

the  nonchanging  of  the  parameters  |b0(k)|  provides  a  means  for  order 
determination. 

When  the  autocorrelation  lags  used  do  not  correspond  to  an  AR  process, 

i  (k>,2 

there  will  be  no  value  of  k  for  which  |b0  |  assumes  a  constant  value 

thereafter.  Since  the  specific  high  order  coefficients  ak^  will  always 
have  a  magnitude  which  never  exceeds  one  [5]  and  [14],  however,  it  is 
apparent  from  expression  (2.17c)  that  lb0^k^l  <_  |b0^k_1^  I  for  all  k  2.  1. 


Thus,  the  parameters  I b 0 ' ^ ^ I  '  form  a  monotonically  nonincreasing  sequence  and 
this  factor  can  be  used  in  model  order  determination.  In  particular,  the 
parameter  lb0^|  may  be  identified  with  a  'prediction  error'  associated 
with  a  kth  order  linear  predictor.  Once  this  prediction  error  becomes 
satisfactorily  small,  the  associated  AR(p)  model  will  form  an  acceptably  good 
approximation  to  the  given  autocorrelation  sequence  (e.g.,  see  ref.  [31]). 
The  meaning  of  ’satisfactorily  small*  is  subjective  and  will  depend  on  the 
particular  application  being  considered  and  empirically  obtained  experience. 

The  parameters  a^(k)  for  k  =  1,  2,  3,  ...  are  also  referred  to  as 
'reflection  coefficients'  and  are  often  denoted  by  c^  =  »^(k) .  These 
reflection  coefficients  have  the  property  that  for  the  truncated  sequence 
rx(0),  rz(l),  ...,  rx(p)  to  be  a  valid  segment  of  an  autocorrelation 
sequence,  it  is  necessary  and  sufficient  that  |c^|  1  for  k  =  1,  2,  ....  p. 

Moreover,  the  transfer  function 

P 

Ap(z)  =  }  «n  z_n  (2-18) 

n=0 


associated  with  the  solution  to  expression  (2.11)  will  have  all  of  its  roots 
on  or  inside  the  unit  circle  if  and  only  if  the  Ic^l  £  1  for  k  *  1,2,  . . . ,  p. 

It  is  noteworthy  that  the  system  of  equations  (2.11)  also  arise  when 
solving  the  optimum  one-step  predictor  problem,  or,  when  using  the  maximum 
entropy  principle  [31] ^  In  the  one-step  predictor  problem,  it  is  desired  to 
select  the  p  predictor  parameters  a^  so  that  the  prediction 

P 

x(n)  =  -  ^  ak  x(n-k)  (2.19) 

k=l 


best  approxiaiates  x(n)  in  the  sense  of  minimizing  the  mean  squared  prediction 
error  E{ |x(n)-x(n) |2) .  One  may  readily  show  that  the  optimum  prediction 
parameters  are  found  by  solving  expression  (2.11)  in  which  l^o  I  Pl*y*  t*1® 
role  of  the  minimum  mean  squared  prediction  error.  On  the  other  hand,  when 
applying  the  maximum  entropy  principle,  it  is  tacitly  assumed  that  the  time 
series  (x(n)}  is  a  zero  mean,  Gaussian  process.  The  objective  is  to  then 
find  a  power  spectral  density  function  Sx(eJw)  which  will  maximize  the 
entropy  measure 
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Jl 

J  jfn  [Sx(ej«)  ]  d«  (2.20) 

-n 


subject  to  the  constraint  that  this  function  will  be  consistent  with  the 
given  set  of  p+1  autocorrelation  lags  rx(0),  rx(l),  ....  rx(p)  through  the 
Fourier  transform  pair  relationship  (1.3).  It  is  readily  shown  that  the 
maximizing  power  spectral  density  function  is  an  AS  process  of  order  p  whose 
parameters  are  given  by  expression  (2.11). 

ASHA  Time  Series 

The  time  series  (x(n)}  is  said  to  be  an  autoregressive-moving  average 
(ARMA)  process  of  order  (p,q)  if  it  is  generated  (or  can  be  modeled) 
according  to  the  recursive  relationship 
P  1 

*(n)  +  ^  ak  x(n-k)  =  ^  bk  e(n-k)  (2.21) 

k=l  k=0 

in  which  the  excitation  sequence  (s(n)}  is  the  aforementioned  normalized 
white  noise  process.  Our  task  is  to  then  determine  values  for  the  ak  and  bk 
parameters  of  this  model  which  are  most  compatible  with  the  given 
autocorrelation  lags  (2.1).  The  mechanism  for  measuring  this  compatibility 
will  be  the  Yule-Walker  equations  (2.3)  which  characterize  the  above  ASHA 
model.  Upon  examination  of  these  equations,  it  is  seen  that  the  ARMA 
parameters  appear  in  a  nonlinear  fashion  through  the  unit-impulse  response 
h(n).  If  the  best  least  squares  modeling  is  desired,  it  is  then  found  that 
the  generation  of  the  optimal  ak,  bk  parameters  involves  the  least  mean 
square  solution  of  the  highly  nonlinear  Yule-Walker  equations.  This  will 
almost  always  necessitate  the  use  of  computationally  burdensome  nonlinear 
programming  algorithms  with  the  attendant  difficulty  of  initial  parameter 
value  selection,  and,  the  possibilities  of  convergence  to  a  local  extrema  or 
even  nonconvergence. 

A  considerable  easing  in  computational  requirements  may  be  achieved  if 
we  allow  ourselves  the  luxury  of  evaluating  the  ak  and  bk  parameters 
separately.  By  using  this  approach,  it  will  be  possible  to  provide  for  a 
linear  solution  procedure  for  the  ak  parameters.  Although  this  approach  will 
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be  suboptimal  in  nature,  it  often  provides  for  a  near  optimal  modeling.  The 
mechanism  for  this  separate  parameter  evaluation  is  obtained  upon  examining 
the  Yule-Walker  equations  (2.3)  which  characterizes  the  ARMA  model  (2.21). 
If  this  model  is  take  to  be  causal,  it  follows  that  the  Yule-Walker  equations 
assume  a  particularly  simple  form  for  indices  n  >  q,  that  is 

P 

^  ak  rx(n-k)  =  0  for  n  X  q+1  (2.22) 

k=0 

We  shall  refer  to  this  particular  subset  of  the  Yule-Walker  equations  as  the 
extended  Yule-Walker  equations.  The  obvious  attractiveness  of  these 
equations  lies  in  the  fact  that  they  are  1 inear  in  the  a^  parameters. 

To  determine  the  a^  autoregressive  parameters  which  are  most  compatible 

with  the  given  set  of  autocorrelation  lags  (2.21),  we  could  adopt  the 

approach  that  characterized  extended  Yule-Walker  equation  AR  and  ARMA 
modeling  methods  up  to  as  recently  as  three  years  ago  (e.g.,  see  refs 

[26] , [28] , [35] , [38] ) .  This  would  entail  evaluating  the  first  p  extended 
Yule-Walker  equations  (i.e.,  q+1  <.  n  £  q+p)  and  then  solving  the  resultant 
system  of  p  linear  equations  in  the  p  auto-regressive  parameters.  Although 
this  approach  is  computationally  attractive,  it  suffers  from  the  obvious 
drawback  that  only  a  subset  of  the  given  autocorrelation  lags  (2.1)  are  being 
used  in  fixing  the  a^  parameters  (i.e.,  rx(n)  for  q-p  <  n  <.  q+p).  To  achieve 
a  ARMA  model  which  better  represents  the  entire  set  of  autocorrelation  lags 
(2.1),  it  is  clearly  beneficial  to  use  more  than  the  minimal  number  (i.e.,  p) 
of  extended  Yule-Walker  equation  evaluations.  The  aj.  parameters  which  yield 
a  least  squares  fit  to  this  overdetermined  set  of  linear  equations  is  then 
found  using  a  straightforward  procedure  to  be  shortly  given. 

This  overdetermined  extended  Yule-Walker  equation  approach  to  ARMA 
spectral  estimation  was  proposed  by  the  author  in  1979  [15].  From  a 

historical  perspective,  it  is  to  be  noted  that  the  idea  of  using  an  extended 
set  of  model  evaluations  forms  a  fundamental  concept  in  system  parameter 
estimation  theory  (e.g.,  see  refs.  [45], [59]).  Moreover,  the  approach  here 
taken  can  be  interpreted  as  being  a  generalized  application  of  the  Prony 
procedure  in  which  the  autocorrelation  lags  play  the  role  of  the  data.  With 
these  thoughts  in  mind,  there  exists  a  rich  source  of  evidence  justifying  the 
use  of  an  overdetermined  set  of 
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extended  Yule-Walker  equations  for  estimating  the  ARMA  model's  autoregressive 
parameters. 

In  this  overdetermined  modeling  approach,  the  extended  Ynle-Walker 
equations  (2.22)  are  evaluated  for  t  distinct  values  of  n  satisfying  n  q+1 . 
To  effect  the  desired  overdeterminacy,  the  integer  t  has  to  be  selected  to  at 
least  equal  p+1  although  larger  values  will  typically  yield  better  model 
representations.  To  illustrate  this  overdetermined  approach,  let  us  consider 
the  first  t  extended  Yule-Walker  equations  (2.22)  indexed  by  q+1  <.  n  £  q+t. 
This  particular  Yule-Walker  equation  evaluation  gives  rise  to  the  following 
overdetermined  system  of  t  linear  equations  in  the  p  autoregressive  parameter 
unknowns^ 


rx ( q+1 )  rx(q) 


rx(q+2)  rx ( q+1 ) 


rx(<5t“P+!) 


rx( Q~P+2) 


rx(q+t)  rx(q+t-l)  ...  rx(q-p+t) 


1 

al 

a2 


(2.23a) 


or  more  compactly  as 

Rj  a  =  e  (2.23b) 

In  this  latter  expression,  e  denotes  the  txl  zero  vector,  Rj  is  the  tx(p+l) 
ARMA  autocorrelation  matrix  with  Toeplitz  type  structure  having  elements 

Rj.(  i,  j)  =  rx(q+l+i-j)  1  <  i  1  t  (2.24) 

1  i  j  i  P+1 


lln  certain  applications,  it  may  be  desirable  to  use  an  other  than 
contiguous  set  of  extended  Yule-Walker  equation  evaluations. 


22 


and  £  is  the  (p+1)  autoregressive  parameter  vector  whose  first  component  is 
required  to  be  one 

i  a  1 1  *  *  aj ,  • • • i  Sp ] 1  (2  *25) 

Examination  of  relationships  (2.23)  reveals  that  the  ARMA  model's 
autoregressive  parameters  are  obtained  upon  solving  this  system  of  t 
overdetermined  (assuming  t  >  p)  linear  equations.  Due  to  the  overdetermined 
nature  of  these  equation,  the  fundamental  question  as  to  whether  a  solution 
exists  naturally  arises.  The  following  theorem  provides  an  answer  to  this 
question  and  is  a  direct  result  of  the  Yule-Walker  equations  which  governs 
ARMA  processes. 

Theorem  2.1:  If  the  autocorrelation  lag  entires  used  in  matrix  Rj  of 
expression  (2.23)  correspond  to  those  of  an  ARMA  (Pi.qj)  process,  then 
the  rank  of  Rj  is  pj  provided  that  p  Pl»  q  2.  41  • 

With  this  theorem  in  mind,  the  existence  of  a  solution  to  relationship  (2.23) 
will  be  dependent  on  the  rank  of  the  autocorrelation  matrix  Rj .  We  shall  now 
consider  separately  the  cases  in  which  Rj  has  full  rank  and  less  than  full 
rank. 


Rank  [Rj]  £  p:  When  the  rank  of  matrix  Ri  has  less  than  full  rank,  a 
nontrival  autoregressive  parameteric  vector  solution  a  will  be  assured.  An 
interesting  algebraic  characterization  of  this  solution  may  be  obtained  upon 
premultiplying  both  sides  of  relationship  (2.23)  by  the  complex  conjuage 
transpose  of  Rj  as  denoted  by  R^  to  yield 

R1R1  «  =  •  (2.26) 
Upon  examination  of  this  expression,  it  is  clear  that  the  required 
autoregressive  parameter  vector  may  be  also  identified  with  a  properly 
normalized  eigenvector  (i.e.,  its  first  component  is  one)  associated  with  a 
zero  eigenvalue  of  the  (p+1  )x (p+1)  matrix  RjRj .  As  such,  we  may  then  use 
standard  eigenvector-eigenvalue  routines  when  finding  the  required  ARMA  model 
autoregressive  parameters. 

Rank  [Rj]  =  p+1:  In  many  cases  of  interest,  however,  it  will  be  found 
that  the  autocorrelation  matrix  Rj  will  have  full  rank.  This  will  occur 
whenever  the  autocorrelation  lag  entries  used  are  associated  with  either  a 
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nonrational  random  time  series,  an  MA  process,  or,  with  a  higher  order  ARMA 
rational  process.  Since  has  fnll  rank,  there  then  will  not  eziat  a 
nontrivial  solution  to  relationship  (2.23).  Nonetheless,  we  still  wish  to 
determine  an  ARMA  model  which  'best  fits'  these  overdetermined  extended 
Ynl e-Walker  equations.  Namely,  we  seek  a  nonzero  antoregressive  parameter 
vector  a  so  that  Rja  most  closely  equals  the  required  ideal  zero  vector  as 
specified  in  (2.23).  Although  a  variety  of  procedures  may  be  used  for 
accomplishing  this  selection,  the  following  two  approaches  typify  many 
spectral  estimation  algorithms. 

(i)  In  the  first  selection  procedure,  it  is  desired  to  find  an 
autoregressive  parameter  vector  lying  on  the  unit  hyper  sphere  which  will 
minimize  the  Euclidean  norm  of  R^a.  This  entails  solving  the  following 
contrained  optimization  problem 

min  a^RjR^ 
ji*a  =  1 


Using  standard  Lagrange  multiplier  concepts,  it  is  readily  shown  that  the 
solution  to  this  optimization  problem  is  obtained  by  selecting  that 
orthonormal  eigenvector  of  the  positive  definite  Hermetian  matrix  ®JRj 
associated  with  its  minimum  eigenvalue.  If  x^  corresponds  to  that 
orthonormal  eigenvector  (i.e.,  RjRjifc  =  AfcXk  with  A*  £  Ak+1  and  x*x^  =  1)  , 
the  required  autoregressive  parameter  vector  with  first  component  of  one  if 
obtained  by  the  normalization. 

1 

a°  - - x.1  (2.27) 

xjd) 

where  xi(l)  denotes  the  first  component  of  xi .  This  autoregressive  parameter 
vector  selection  procedure  characterizes  many  spectral  algorithms  which  are 
varients  of  the  Pisarenko  method  [55]  and  is  generally  not  suitable  for  an 
efficient  computational  solution. 

(ii)  In  the  second  selection  procedure,  we  wish  to  minimize  the 
Euclidean  norm  of  Ria  over  all  (p+l)xl  vectors  a  with  first  components  equal 
to  one,  that  is 
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min 

a(l)=l 

Appealing  to  the  Lagrange  multiplier  approach  again,  it  ia  found  that  the 
solution  to  this  constrained  optimization  problem  is  given  by  solving  the 
following  linear  system  of  equations 

KjR^0  =  aet  (2.28) 

where  the  normalizing  constant  a  is  selected  so  that  the  first  component  of 
a<>  is  one. 

In  using  either  of  the  above  two  procedures,  we  are  seeking  to  best 
satisfy  theoretical  relationship  (2.23)  in  the  least  squares  sense  subject  to 
appropriate  constr aintsl .  The  particular  application  at  hand 

dictates  which  autoregressive  parameter  vector  selection  procedure  provides 
the  best  performance.  It  has  been  the  author's  experience  that  the  selection 
(2.27)  has  often  provided  reasonable  modeling  (also  see  ref.  [12]).  In  terms 
of  computational  efficiency,  however,  the  linear  selection  (2.28)  enjoys  a 
clear  superiority  due  to  the  availability  of  efficient  adaptive  algorithms  as 
outlined  in  Section  X.  With  this  in  mind,  we  shall  mainly  focus  our 

attention  on  the  linear  selection  (2.28)  . 

In  summary,  the  ARMA(p,q)  model  associated  with  a  given  set  of 
autoregressive  lags  entails  an  examination  of  the  matrix  Rj .  if  this  matrix 
is  not  of  full  rank,  the  required  exact  autoregressive  parameter  vector  will 
be  given  by  solving  expression  (2.26).  On  the  other  hand,  when  the  matrix 
has  full  rank,  an  appropriate  autoregressive  parameter  vector  may  be  achieved 
by  solving  either  expression  (2.27)  or  (2.28).  It  is  important  to  appreciate 
the  fact  that  these  ARMA  results  are  applicable  to  the  special  AR  process  ia 
which  case  we  simply  enter  q=0  when  forming  the  ARMA  autocorrelation  matrix 
Rl. 


1 It  is  possible  to  generalize  the  constraints  to  be  a  quadratic  surface 
(giving  rise  to  a  generalized  eigenvector  solution)  or  a  hyperplane, 
respectively  [10]. 


Moving  Average  Parameters 

In  order  to  complete  the  ARMA  modeling,  it  is  necessary  to  determine  the 
model's  associated  moving  average  parameters.  There  are  a  variety  of 
procedures  for  achieving  this  objective.  We  shall  present  two  snch 
procedures  of  which  the  first  is  the  one  most  often  found  in  the  literature 
while  the  second  possesses  a  desirable  efficient  computational 
implementation. 

(i)  In  the  first  procedure,  one  conceptually  applies  the  time  series 
(x(n)}  to  the  ptk  order  nonrecursive  filter  with  transfer  function  Ap(z) 
whose  coefficients  correspond  to  the  autoregressive  parameters  obtained  upon 
solving  either  expression  (2.2 6),  (2.27)  or  (2.28).  This  filtering  produces 
the  so-called  residual  time  series  as  specified  by 

P 

s<n>  =  ^  “mxU-m)  (2.29) 

BF=0 

This  filtering  causes  the  residual  time  series  to  be  a  moving  average  process 
of  order  q  with  power  spectral  density  function  |Bq(e^w)l^  as  is  made  evident 
from  Figure  2.2.  This  of  course  presumes  that  (x(n)}  corresponds  to  an  ARMA 
processor  of  order  (p.q)  or  less.  A  simple  analysis  indicates  that  the 
length  2q+l  autocorrelation  sequence  of  this  residual  time  series  may  be 
computed  according  to 

ak*m  rx(n+m-k) 


Using  these  MA(q)  autocorrelation  lags,  it  follows  from  expression  (2.5)  that 
the  unknown  b^  parameters  must  be  such  that 

q 

rs(n)  =  ^  bfc  bfc_n  -qin^q  (2.31) 

k=0 

A  spectral  factorization  along  the  lines  mentioned  in  this  section's  MA  time 
series  subsection  will  then  yield  the  desired  b^  parameters. 
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e(n) 

Bq(  z) 

x(n) 

Ap(z) 

Ap(z) 

a(n) 


Figure  2.2.  Generation  of  residual  tiaie  series. 


(ii)  If  computational  requirements  are  of  vital  concern,  the  technique 
to  be  now  outlined  is  particularly  efficient  [15], [16].  It  utilizes  the 
Fourier  transform  of  the  causal  part  of  the  autocorrelation  sequence 

CD 

D(«jW)  =  ^  rx(n)e-j«®  (2.32) 

n=l 


The  underlying  power  spectral  density  function  may  be  directly  determined 
from  this  Fourier  transform  according  to 

Sx(ej*>)  ■  rx(0)  +  2Re[D(ej“)l  (2.33) 

A  comparison  of  this  expression  with  relationship  (1.13)  reveals  that  the 
transform  D(ejw)  must  be  of  the  form 

cie"jw+C2e"j2w  +  ...  +  cpe-jPw 

D(e  J  w)  =  - - - 

l+sie~jw  +  ...  +  ape“JPw 


C(e» 

Ap(eJ“) 


(2.34) 


where  we  are  tacitly  assuming  that  the  moving  average  order  is  not  larger 
than  the  autoregressive  order  (i.e.,  q  £  p) . 

To  determine  the  required  cn  coefficients  in  expression  (2.34),  we  will 
first  compute  the  first  s  impulse  response  elements  of  the  filter  H(ejw)  = 
l/Ap(eJw).  This  will  entail  using  the  following  relationship 

P 

h(n)  =  -  ^  akh(n-k)  1  i  n  i  s  (2.35) 

k-1 
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in  which  h(0)  *  1  and  h(n)  *  0  for  n  <  0  arc  used  to  initiate  the  reoaraioa. 
We  next  nte  the  tiae  doaain  equivalency  of  relatioaahip  (2.34)  to  coaelade 
that 


In  general,  the  overdetermined  system  of  equations  (2.36)  will  not  have 
a  solution  unless  the  autocorrelation  elements  rz(n)  are  associated  with  an 
ARMA  process  of  order  (p,p)  or  lower.  Assuming  this  not  to  be  the  case,  we 
could  select  the  vector  c  so  as  to  provide  a  least  squares  solution  to 
expression  (2.36).  This  would  take  the  form  of  solving  the  consistent  system 
of  linear  equations 

c  =  [H*B]-lH*r  (2.37) 

In  order  to  achieve  the  aforementioned  efficient  computational 
algorithm,  the  parameter  s  may  be  taken  to  be  p  which  renders  the  following 
straightforward  method  for  evaluating  the  cn 

n-1 

cn  -  ^  akrx(n-k)  linip  (2.38) 

k=0 

This  is  basically  the  approach  taken  in  references  [IS]  snd  [16].  In  using 
expression  (2.38)  for  evaluating  the  cn,  we  are  trading  off  performance  for 
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computational  efficiency.  It  has  been  the  author's  experience  that  the 
spectral  estimates  achieved  upon  using  the  least  squares  fit  (2.37)  do  not 
typically  provide  a  superior  performance  to  those  given  by  the  simpler 
relationship  (2.38).  In  any  case,  once  the  cn  parameters  have  been 
determined,  the  Fourier  transform  (2.34)  is  used  in  expression  (2.33)  to 
effect  the  required  power  spectral  density  model.  Moreover,  if  it  is  desired 
to  evaluate  the  b^  parameters,  we  can  use  the  identity 

|Bq(ej“)|2  =  Ap(ej“)  C(ej“)+  Ap(ej“)C(ej“)+rx(0) lAp(eJ«) I2  (2.39) 

and  a  spectral  factorization  to  achieve  this  objective. 

In  this  section,  we  have  outlined  convenient  procedures  for  generating 
MA,  AK  and  ARMA  spectral  models  when  perfect  autocorrelation  lag  information 
is  available.  The  principal  steps  of  these  procedures  are  summarized  in 
Table  2.2.  Although  these  results  are  of  primarily  theoretical  interest,  we 
will  subsequently  adapt  them  to  evolve  effective  rational  spectral  estimation 
methods  for  the  more  practical  case  where  only  raw  time  series  observations 
are  used  in  the  modeling. 
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MA  Model 


q 

Sx(ejw)  =  ^  w(n)  rx(n)  e-j"®  (1.6) 

n=-q 

AR  Model 

(i)  Fora  the  (p+l)x(p+l)  AR  autocorrelation  matrix  R  using  expression 

(2.12) 

(ii)  Solve  Ra  =  |b0|^  (2.11) 

where  parameter  bQ  is  selected  so  that  the  first  component  of  a  is 
one. 

2 

bo 

(iii)  Sx(eJW)  =  - - - — 

l+ax  e"Jw  +  ...  +  ape“jp" 

ARMA  Model 

(i)  Form  the  tx(p+l)  ARMA  autocorrelation  matrix  Rj  using  expression 
(2.24) 

(ii)  (a)  If  Rank  (Rj  Rj)  <  p+1  then  solve 

R1*R1  »  *  9  (2.2 3) 

(b)  If  Rank  (Rj.*Ri)  =  p+i  then  either  solve 

R1*R1  s  =  a  ej  (2.28) 

where  a  is  selected  so  that  first  component  of  a  is  one. 
or 

use  the  minimum  eigenvalue-eigenve . .or  yielding  selection 
(2.27)  . 

P  P 

(iii)  rs(n)  ■  ^  ^  ak*m  rx(n+m-k)  0<nlq  (2.32) 

k=0  bfO 

q 

(iv)  Sx(ejw)  = 

Table  2.2.  Rational  spectral  model  techniques  employing 
exact  autocorrelation  lag  information. 


n=-q 
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III.  Sinusoids  in  White  Noise  Example 

The  procedure  as  developed  in  the  preceding  section  is  applicable  to  the 
task  of  generating  rational  models  for  the  general  class  of  wide-sense 
stationary  time  series.  In  order  to  demonstrate  the  relative  effectiveness 
of  MA,  AR,  and  ARMA  modeling,  the  classical  problem  of  the  detection  and 
frequency  identification  of  the  sinusoids  in  white  noise  case  will  now  be 
considered.  Although  this  does  represent  a  very  narrow  application  of 
rational  spectral  estimation  techniques,  it  provides  a  meaningful  basis  for 
understanding  the  relative  performance  capabilities  of  MA,  AR,  ARMA  models. 
In  particular,  the  time  series  being  now  examined  is  taken  to  be  composed  of 
the  sum  of  m  real  sinusoids  in  additive  noise  as  specified  by 


x(n)  =  ^  Ak  sin  [2nfkn  +  ek]  +  w(n) 

k=l 


(3.1) 


in  which  the  ok  are  independent,  uniformly  distributed  random  variables  on 
the  interval  [-it, it]  and  w(n)  is  a  zero  mean,  variance  o^  white  noise  process. 
It  is  recalled  that  the  problem  of  detecting  sinusoids  in  noise  originally 
gave  rise  to  spectral  estimation  theory.  The  periodogram  method  was 
developed  for  this  very  purpose  by  Schuster  in  1898  [58]. 

The  task  at  hand  is  to  generate  MA,  AR,  and,  ARMA  models  from  the 
autocorrelation  values  associated  with  this  time  series  using  the  procetures 
outlined  in  the  previous  section.  It  is  a  simple  matter  to  show  that  the 
autocorrelation  sequence  characterizing  time  series  (3.1)  is  given  by 
m 

2 


rx(n)  =  ^  0.5  Ak  cos  [2nfkn]  +  o^8(n) 

k=l 


(3.2) 


in  which  6(n)  denotes  the  unit-impule  (Kronecker  delta)  sequence.  The  power 

spectral  density  function  associated  with  this  process  is  composed  of  2m 

2 

dirac  delta  impulses  of  amplitude  0.5  A^  located  at  frequencies  ±fk  riding  on 
top  of  a  constant  value  ,  As  such,  this  discontinuous  power  spectral 
density  function  may  not  be  associated  with  a  finite  order  MA,  AR,  or  ARMA 
process. 


Although  the  autocorrelation  sequence  (3.2)  is  not  computable  with  a 
finite  order  ARMA  model,  it  is  readily  shown  that  this  sequence  will  satisfy 
the  following  homogeneous  relationships 

2m 

^  »k  rx(n-k)  =  0  for  n  >  2m  (3.3) 

k=0 

where  a0  =  l.  The  ak  parameters  required  in  this  expression  are  obtained  by 
equating  coefficients  of  the  following  polynomial  equivalency 

2m 

A2m(*)  =  ) 

n=0 

m 

=  (1-2  z"l  cos(2«fk)  +  z~2]  (3.4) 

k=l 

where  the  zeroes  of  this  polynomial  (i.e.,  e±j2nfk)  are  identified  with  the 
frequencies  of  the  time  series'  sinusoids  (e.g.,  see  refs.  [10] , [32] , [55] ) . 

Upon  comparison  of  relationships  (3.3)  and  (2.22),  it  might  be 
incorrectly  inferred  that  the  autocorrelation  sequence  (3.2)  would  be 
associated  with  an  ARMA  process  of  order  (2m, 2m).  Upon  examination  of  the 
Yule-Walker  equations  for  indices  0  £  n  ^  2m,  however,  it  will  be  found  that 
an  exact  correspondence  does  not  result.  This  simply  reflects  the  fact  that 
the  time  series  (3.1)  does  not  arise  from  exciting  a  linear  ARMA  operator 
„  with  white  noise.  Nonetheless,  due  to  the  identical  forms  of  equations 
(2.22)  and  (3.3),  we  may  still  use  the  ARMA  modeling  autoregressive  parameter 
procedure  as  outlined  in  Section  II  to  identify  the  2m  parameters  a^.  These 
parameters  would  be  then  in  turn  inserted  into  relationship  (3.4)  to  identify 
the  frequency  parameters  f^  upon  factorization  of  the  polynomial  A2m(*)> 
This  spectral  behavior  can  be  conveniently  displayed  in  a  plot  of 
|l/A2a(eJw)l  versus  w. 

Once  the  f^  frequency  parameters  have  been  determined,  the  associated  A^ 
amplitude  parameters  may  be  obtained  upon  evaluating  expression  (3.2)  over 
any  set  of  m  or  more  indices  satisfying  n  2  With  this  in  mind,  let  us 
evaluate  this  expression  for  the  contiguous  indices  1  £  n  £  v  where  the 
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integer  v  2.  This  is  found  to  yield  the  following  overdeterained  (if  v)a) 
system  of  consistent  linear  equations  in  the  A^  unknowns 


rx(D 

rx(2) 

e 

= 

cos(2nfj) 

cos(4nfj) 

• 

cos(2nf2) 

cos(4nf2) 

•  •  • 

•  «  • 

cos(2nfB) 

cos (4nfm) 

e 

_rx(v) 

• 

• 

cos(2vnfj) 

cos(2vnf2) 

e  •  e 

cos(2nvfm) 

or  equivalently  as 


2  • 
Al/2 
2 

a2/2 


2 

A*/ 2 


(3.5) 


£  =  Cp  (3.6) 

2 

where  £  is  the  so-called  mxl  power  vector  with  elements  A^/ 2 .  If  the  integer 
paraaeter  v  is  selected  to  be  larger  than  or  equal  to  m,  the  least  square 
approxiaate  solution  to  the  overdetermining  equations  (3.6)  is  given  by 

p  =  [C'C]"1  C'  r  (3.7) 

where  C*  designates  the  transpose  of  matrix  C.  In  the  case  of  perfect 
autocorrelation  knowledge,  we  normally  set  v  =  m  thereby  giving  the  solution 
£  *  C”lj:.  In  the  more  practical  case  in  which  only  raw  time  series 
observations  are  given  for  the  estimate,  however,  a  desirable  degree  of 
paraaeter  smoothing  is  achieved  by  selecting  v  >  m. 

Although  the  sinusoids  in  white  noise  time  series  (3.1)  is  not 
compatible  with  an  AR  model,  AR  models  have  also  been  successfully  employed 
in  analyzing  such  time  series.  Depending  on  the  underlying  signal  to  noise 
ratios 


2«2 


1  <  k  i  m 


the  desired  detection  and  frequency  estimation  will  require  that  the  AR  order 
paraaeter  p  be  aade  sianif icatlv  larger  than  2m.  Variants  of  the  Pisarenko 
method  [55].  and,  the  SVD  approach  of  Tufts  and  Kumaresan  [42], [61]  typically 
produce  satisfactory  performance  on  the  sinusoids  in  white  noise  case.  As  we 
will  illustrate  in  Section  VIII,  the  approach  taken  in  this  paper  will  also 
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produce  exceptional  performance  when  an  SVD  adaption  of  the  ABJiA  modeling 
method  herein  presented  is  made. 


Alternate  Method 

It  is  possible  to  apply  the  concept  of  using  an  overdetermined  system  of 
model  evaluations  for  achieving  high  quality  alternative  estimates  for  the 
frequency  parameters  appearing  in  expression  (3.1).  This  will  make  use  of 
the  observation  that  homogeneous  relationship  (3.3)  holds  for  all  values  of  n 
provided  that  there  is  no  white  noise  present  (i.e.,  o 2  =  0).  Under  this 
restriction,  an  evaluation  of  expression  (3.2)  with  ®2  =  o  over  the  indices 
“t  +  2p  <_  n  t  (in  which  p  =  2m)  is  found  to  result  in  the  following 
symmetrical  relationship 


rx(-t+p)  rx(-t+p-l)  .  .  .  rx(-t) 

i 

ro 

rx(-t+p+l)  rx(-t+p)  .  .  .  rx(-t+l) 

*i 

0 

•  •  * 

e 

• 

e 

= 

e  •  i 

e 

e 

* 

e 

*P  _ 

• 

*  *  e 

rx(t)  rx(t-l)  .  .  .  rx(t-p) 

0 

or 

Rs  A  =  «  (3.8b) 

in  which  t  is  selected  so  that  t  >  3m/ 2  thereby  ensuring  an  overdetermined 
system  of  homogeneous  relationships. 

If  the  autocorrelation  lag  entries  of  expression  (3.8)  correspond  to 
(3.2)  with  =  o,  it  then  follows  that  the  overdetermined  system  of 

equations  (3.8)  will  have  a  unique  solution  for  the  a^  coefficients.  This 
solution  can  then  be  incorporated  into  equation  (3.4)  to  obtain  estimates  for 
the  frequency  f^  parameters.  In  the  additive  noise  case  <r^  ft  0.  however, 
this  system  of  equations  will  generally  not  have  a  solution.  Since  the  <y2 
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tern  appears  in  only  p+1  ont  of  the  (2t-p+l)x(p+l)  entries  of  matrix  Rg 

(i.e.,  the  rx(0)  entries),  it  can  be  argued  that  so  long  as  t>>p,  the  effect 

of  the  additive  noise  will  be  minimal.  Based  on  this  premise,  it  is  natural 

to  then  seek  a  vector  a  such  that  this  inconsistent  system  of  linear 

equations  is  best  satisfied  in  a  least  squares  sense.  The  required  least 
squares  solution  is  then  given  by  solving  the  system  of  equations 

Rs*  W  Rs  a  =  a  ej  (3.9) 

in  which  a  is  a  normalizing  scalar  selected  to  ensure  that  the  first 
component  of  a  is  one.  The  nonnegative  diagonal  matrix  V  is  typically 
selected  to  be  equal  to  the  identity  matrix.  As  we  will  see  in  Section  VIII, 
the  solutions  obtained  by  using  expression  (3.9)  often  provide  exceptional 
estimates  so  long  as  t>>p.  A  paper  in  preparation  will  further  refine  this 
new  approach. 


Numerical  Example 

In  order  to  illustrate  the  effectiveness  of  the  three  rational  models  in 
resolving  sinusoids  embedded  in  white  noise,  we  shall  now  consider  the 
specific  time  series 

x(n)  =  sin(0.4nn)  +  sin(0.43nn)  +  w(n)  (3.10) 

The  white  noise  series  (w(n)}  will  be  taken  to  have  a  variance  of  0.5  thereby 
creating  a  zero  dB  signal-to-noi se  ratio  (SNR)  environment.  According  to 
relationship  (3.2),  the  autocorrelation  sequence  associated  with  this  time 
series  is  specified  by 

rx(n'  =  0.5  cos(0.4un)  +  0.5  cos(0.43nn)  +  0.56(n)  (3.11) 

We  shall  now  use  these  autocorrelation  lags  along  with  the  concepts  developed 
in  Section  II  to  generate  appropr.  *  MA,  AR  and  ARMA  models.  A  brief 
discussion  of  the  resultant  modeling  performances  in  this  idealistic 
situation  will  now  be  given. 

MA  Models:  When  using  the  classical  spectral  modeling  expression 
4 

Sx(eJ«)  =  ^  rx<*»)  «-jwn  (3.12) 

n=-q 
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we  are  in  effect  invoking  a  MA(q)  model.  Plots  of  this  expression  with 
entries  (3.11)  for  model  order  selections  of  q  =  32  and  q  =  64  are  shown  in 
Figure  3.1  over  the  range  of  normalized  frequencies  0  £  f  ^  0.5.  From  these 
results,  it  is  apparent  that  a  resolution  of  the  two  equal  amplitude 
sinusoids  was  not  achieved  for  a  thirty  second  order  HA  model,  but,  was 
achieved  for  a  sixty  fourth  order  MA  model.  Thus,  an  artificially  high  order 
MA  model  was  required  in  order  to  resolve  the  two  sinusoids  when  exact 
autocorrelation  lags  were  used.  This  example  nicely  demonstrates  the 
distortions  which  can  result  when  invoking  a  MA  model  i*-  the  underlying 
assumption  that  rx(n)  =  0  for  n  >  q  thereby  implied  is  not  satisfied  (or 
approximately  satisfied).  Clearly,  the  nondamped  nature  of  the 

autocorrelation  sequence  (3.2)  behavior  indicates  that  the  MA  modeling  of  a 
time  series  composed  of  sinusoids  in  white  noise  can  be  inappropriate  unless 
a  sufficiently  large  selection  of  the  MA  model  order  q  is  made. 

AR  Models:  We  next  used  the  same  autocorrelation  lag  information  (3.11) 
to  generate  AR  models  of  order  p  =  20  and  p  =  24  using  expression  (2.11). 
The  resultant  spectral  estimates  1/ lAp(eju>)  are  shown  in  Figure  3.2a  and  b 
for  these  two  model  order  choices.  It  is  apparent  that  the  twentieth  order 
model  was  unable  to  resolve  the  two  sinusoids  while  the  twenty-fourth  was 
just  able  to  achieve  the  resolution.  Since  the  specific  autocorrelation  lags 

rx(n)  for  0  <.  n  £  p  were  required  for  generating  an  AR(p)  model,  it  is 
apparent  that  fewer  autocorrelation  lags  were  needed  to  resolve  the  two 
sinusoids  when  using  an  AR(24)  model  in  comparison  to  the  MA  model.  This 
simply  gives  credance  to  the  previously  made  suggestion  that  AR  models 
provide  a  more  effective  instrument  for  representing  peak  like  spectra  than 
are  MA  models. 

In  order  to  illustrate  the  effect  of  using  more  than  the  minimal  number 
of  extended  Yule-Walker  equations  (i.e.,t  >  p)  when  generating  an  AR  model, 
we  next  used  the  ARMA  modeling  equations  (2.23)  with  parameters  p=10,  q=0, 
and  t=100.  The  AR(10)  model  which  results  upon  solving  equations  (2.23)  for 
this  choice  of  order  parameters  has  a  spectral  behavior  as  depicted  in  Figure 
3.2c.  This  AR(10)  spectral  estimate  is  seen  to  be  significantly  better  than 
that  achieved  by  the  higher  order  AR(24)  estimate.  Clearly,  the  process  of 
using  100  (i.e.,  t=100)  extended  Yule-Walker  equation  evaluations  instead  of 
the  minimal  number  10  has  produced  this  significant  improvement.  This 
improvement  is  due  to  the  fact  that  only  the  first  four  of  the  one  hundred 


extended  Yule-Walker  equation  evaluations  are  In  error  due  to  the  Inposition 
of  an  inproper  AR  nodel  (see  equation  (3.3)).  By  increasing  t  beyond  p.  the 
effect  has  been  to  dilute  the  negative  impact  of  the  erroneous  first  four 
Yule-Walker  equations  on  the  model  parameters  (i.e.,  four  improper  equations 
and  96  appropriate  equations).  The  reader  is  urged  to  fully  understand  the 
implications  of  this  result  in  a  more  broadly  based  context. 

ARMA  Nodel:  We  next  used  the  given  autocorrelation  lag  information 

(3.11)  to  generate  an  ARMA  model  of  order  p  =  4  by  appealing  to  expression 

(2.23).  We  here  select  the  variable  t  to  be  equal  to  its  minimal  value  of 
four,  and,  in  accordance  with  this  section's  discussion  take  q  =  4 .  The 
resultant  ARMA  based  spectral  model  1/  I A4 ( e j **>)  I ^  without  the  MA  component  is 
plotted  in  Figure  3.3.  The  two  sinusoids  are  nicely  resolved  and  when  the 
fourth  order  polynomial  A^eJ4*)  was  factored,  it  was  found  to  have  its  four 
roots  on  the  unit  circle  at  eij^nfjt  for  k  =  1,2  in  which  fj  =  0.2  and  f2  = 
0.215.  This  should  not  be  surprising  since  it  was  previously  shown  in  this 
section  that  an  ARMA  type  model  is  perfectly  compatible  with  a  sinusoids  in 
white  noise  time  series  (MA  and  AR  models  are  not  compatible).  It  is 
noteworthy  that  only  the  autocorrelation  lags  rx(n)  for  1  <.  n  £  8  were 

required  in  generating  the  spectral  model  depicted  in  Figure  3.3. 

Alternative  Method:  As  a  final  procedure,  we  used  the  alternative 

method  as  represented  by  relationship  (3.9)  in  which  the  parameters  were 
taken  to  be  p=10  and  t=50 .  Using  these  parameters  along  with  the  theoretical 
autocorrelation  lag  entries  (3.11)  a  plot  of  the  resultant  estimate 
1/ lAio(ej“)  |2  is  shown  in  Figure  3.4.  The  two  sinusoids  are  resolved  with 
well  defined  peaks,  and,  the  spectral  estimates  are  superior  to  those 
achieved  by  the  MA  and  AR  model  results  but  inferior  to  the  ARMA  model. 
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Fig.  3.3  Autoregressive-moving  average 
(ARMA)  spectral  models  using 
expression  (2.23)  with  exact 
autocorrelation  lags  and  p=t=4. 
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IV.  MA  Modeling  -  Time  Series  Observations 

From  a  practical  viewpoint,  the  situation  in  which  exact  autocorrelation 
lag  values  are  given  for  effecting  a  spectral  estimate  almost  never  arises. 
More  typically,  the  required  spectral  estimate  is  to  be  generated  from  a 
finite  set  of  contiguous  time  series  observations  as  represented  by 

x ( 1 ) ,  x(2) .  x(N)  (4.1) 

In  this  section,  we  will  be  concerned  with  achieving  MA  spectral  estimates 
from  this  observation  set.  The  methods  to  be  presented  for  this  purpose  are 
largely  influenced  by  the  theoretical  developments  found  in  Section  II. 

There  exist  two  primary  MA  spectral  estimation  procedures  that  have 
found  favor  among  users.  They  are  indi  ect  methods  based  on  autocorrelation 
estimates  such  as  proposed  by  Blackman  and  Tukey  [8],  and,  direct  methods 
based  on  the  Fourier  transform  of  the  time  series  observations  and  widely 
known  as  the  periodogram  (or  the  method  of  averaged  periodograms  due  to  Welch 
[64]).  As  we  will  shortly  see,  the  periodogram  is  a  special  case  of  the 
Blackman-Tukey  approach. 


Blackman-Tukev  Approach 

In  the  Blackman-Tukey  method,  one  first  obtains  autocorrelation 
estimates  rx(n)  from  the  given  observation  set  (4.1).  These  autocorrelation 
estimates  are  then  inserted  into  expression  (1.2)  to  effect  the  required 
spectral  estimate.  For  a  variety  of  reasons,  it  is  often  beneficial  to 
introduce  a  windowing  sequence  w(n)  to  achieve  the  windowed  MA  spectral 
estimate  of  order  q 

q 

S(ej»)  =  ^  w(n)  rjtnje-!"11  (4.2) 
n=-q 

Considerations  to  be  made  in  selecting  the  window  sequence  are  well 
documented  and  the  reader  is  referred  to  references  [33 ] , [ 50] , [57]  .  Two  of 
the  more  popular  selections  are  the  rectangular  window  (i.e.,  w(n)  =  1)  and 
the  Bartlett  triangle  window  (i.e.,  w(n)  -  ( 1- In  I ) / ( q+1) ) . 

The  standard  unbiased  and  biased  autocorrelation  estimates  are  among  the 
most  popular  candidates  to  be  used  in  the  spectral  estimate  (4.2)  (e.g.,  see 
ref.  [33]  for  a  detailed  development).  The  unbiased  estimate  achieves  the 
required  autocorrelation  lag  estimate  according  to 
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-qi  n  £  q 


(4.3) 


rx(n) 


N-  In  | 


N 

^  x(k+n)x(k) 
k=l 


where  the  convention  of  setting  to  zero  any  term  x(n)  in  the  rnaind  for 
which  n  e  [1,N]  is  adopted.  It  is  a  simple  matter  to  show  that  E{rz(n)}  = 
rz(n)  thereby  establishing  the  unbiased  nature  of  estimate  (4.3).  Moreover, 
this  unbiased  estimate  is  also  consistent  so  long  as  the  order  parameter  q  is 
finite. 

Notwithstanding  the  obviously  attractive  statistical  properties 
possessed  by  the  unbiased  estimate  (4.3),  a  number  of  prominent  statisticians 
have  proposed  using  the  standard  biased  estimate  (e.g.,  see  refs. 
[33], (52], [53]. 

N 

rz(n)  =  £  ^  x(k+n)x(k)  -«  i  n  i  q  (4.4) 

k=l 

We  again  adhere  to  the  convention  of  setting  to  zero  any  term  x(n)  in  the 
summand  for  which  n  e  [1,N],  The  justification  for  using  the  biased  estimate 
is  that  it  is  more  stable  statistically.  It  must  be  noted,  however,  that  the 
relative  advantages  of  unbiased  vs.  biased  estimators  remains  an  unsettled 
issue.  With  this  in  mind,  the  user  is  cautioned  to  base  his  ultimate 
selection  on  the  particular  application  being  considered.  This  will 
undoubtably  entail  a  great  deal  of  empirically  based  experimentation  on  the 
users  part. 


Periodoaram 

In  the  periodogram  method,  the  required  spectral  estimate  is  given  by 
the  expression 


Sz(eJ“) 


jf  IXN(ej«)|2 


(4.5) 


where  l^(ej<0)  is  the  Fourier  transform  of  the  time  series  observations,  that 
is 
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x(n+l)e-jo« 


(4.6) 


N-l 

XN(eJ«>)  =  } 
n=0 

We  here  use  the  subscript  N  on  %(ej“)  to  explicitly  denote  its  dependency  on 
the  observation  length  parameter.  It  is  readily  shown  that  the  periodogram 
is  identical  to  the  Blackman-Tukey  approach  when  the  biased  estimates  (4.4) 
are  used  in  expression  (4.2)  with  q  =  N-l  and  w(n)  =  1. 

The  primary  advantage  in  using  the  Periodogram  approach  is  computational 
in  nature.  Specifically,  the  values  of  the  periodogram  at  the  N  discrete  set 
of  uniformly  spaced  radian  frequencies  u ^  =  2nk/N  for 
0  i  k  i  N-l  is  seen  to  entail  evaluation  of  the  entities 

N-l 

.  2»rk  V  2nkn 

XN(eJlT)'’  l  x(n+l)e  J  “ IT  .  0  £  k  £  N-l  (4.7) 

n=0 

These  evaluations  are  readily  carried  out  by  use  of  the  N  point  fast  Fourier 
transform  (FFT)  algorithm  (e.g.,  see  refs.  [50], [57]).  With  the  FFT 
algorithm,  the  N  quantities  (4.7)  may  be  computed  in  which  the  required 
number  of  complex  additions  and  multiplications  is  on  the  order  of  N  log^N. 
The  computational  savings  accrued  in  using  the  FFT  algorithm  for  spectral 
estimates  is  considerable  when  it  is  realized  that  a  direct  evaluation  of 
expression  (4.7)  is  seen  to  entail  N^  complex  additions  and  multiplications. 
Due  to  the  computational  savings  accrued  in  using  the  FFT  implementation  of 
the  periodogram,  spectral  estimates  of  long  data  sequences  became  feasible 
with  the  FFT' s  development. 

Although  the  FFT  algorithm  offers  a  computationally  efficient  means  for 
numerically  evaluating  the  periodogram  (4.5),  it  possesses  a  potentially 
serious  drawback.  Specifically,  as  just  suggested,  this  FFT  implementation 
provides  a  sampled  version  of  the  periodogram  in  which  the  frequency  samples 
are  separated  by  2n/N  radians.  For  many  applications  of  interest,  this 
sampling  may  be  too  coarse  in  that  the  detailed  continuous  frequency  behavior 
of  the  periodogram  (4.5)  may  be  somewhat  obscured  through  the  sampling 
process.  An  example  of  this  will  be  given  in  Section  VIII.  In  order  to 
alleviate  this  potential  difficulty,  we  may  apply  the  concept  of  zero 
naddina.  This  simply  entails  the  appending  of  L  zeroes  to  the  given  set  of 
time  series  observations,  that  is 


40 


x(l),  x(2) . 


x(N) ,  0,  0,  ... , 


0 


(4.8) 


L  zeroes 

where  L  is  s  yet  unspecified  positive  integer.  If  we  were  to  take  the 
Fourier  transform  of  this  padded  time  series,  we  would  obtain  the  same 
transform  (4.6)  and  the  same  periodogram  function  (4.5).  On  the  other  hand, 
if  we  were  to  take  a  N+L  point  FFT  of  this  padded  time  series,  the  following 
more  finely  spaced  samples  of  the  Fourier  transform  would  be  generated 

N-l 

•  2nk  ^  .  _ »  2nkn 

v  <«J  is?  ■  \  I(  <4-,) 

n=0 

If  these  sampled  values  were  then  substituted  into  expression  (4.5) ,  we  would 
obtain  sampled  values  of  the  periodogram  at  the  more  finely  spaced 
frequencies  =  2n/(N+L)  for  0£k<N+L.  The  effect  of  the  L  zero  padding  is 
then  seen  to  result  in  a  reduction  of  the  frequency  sampling  interval  from 
2n/N  to  2rr/(N+L).  By  selecting  L  suitably  large,  we  can  reduce  this  sampling 
interval  to  any  degree  desirable. 

One  should  not  gain  the  mistaken  impression  that  padding  will  enable  us 
to  achieve  any  degree  of  frequency  resolution  desired.  The  fundamental 
unsampled  periodogram  (4.5)  has  an  inherent  frequency  resolution  capability 
of  Aw  =  2n/N  (or  equivalently  Af  *  1/N) .  When  using  a  N  point  FFT 
implementation  of  the  periodogram,  however,  it  is  entirely  possible  that 
spectral  peaks  may  lie  between  the  sampled  frequencies  =  2nk/N.  In  such 
cases,  the  peaks  effect  on  the  sampled  periodogram  may  be  seriously  diluted 
even  though  it  would  be  clearly  evident  in  the  unaampled  periodogram.  Upon 
padding  with  L  zeroes,  we  can  remove  the  ambiguity  caused  by  this  sampling 
process  and  still  retain  the  computational  efficiency  of  an  FFT 
impl ementation. 


V.  AS  Modeling  -  Tine  Series  Observations 

The  task  of  generating  AS  spectral  models  from  a  set  of  time  series 
observations  has  been  of  primary  concern  to  many  investigators  over  the  last 
few  years.  Undoubtedly,  the  most  widely,  used  AS  modeling  procedure  is  the 
Burg  algorithm  as  first  proposed  in  1967  [13],  This  algorithm  not  only 
provided  a  spectral  estimation  capability  that  was  theretofore  lacking,  it 
also  inspired  an  intense  search  for  improved  rational  spectral  estimation 
procedures.  Much  of  contemporary  spectral  estimation  theory  has  been 
directly  influenced  by  the  philosophy  contained  within  the  Burg  approach.  As 
a  matter  of  fact,  many  of  the  more  recent  rational  estimation  procedures  were 
developed  so  as  to  overcome  some  of  the  deficiencies  observed  in  the  Burg 
algorithm  as  typified  by  line  splitting  and  biased  frequency  estimates. 
Nonetheless,  the  Burg  algorithm  still  occupies  the  pre-eminent  position  among 
contemporary  AR  modeling  methods.  Since  its  operational  behavior  is  so  well 
documented,  we  refer  the  interested  reader  to  the  relevant  literature  for  its 
detailed  development  (e.g.,  see  refs.  [23], [31]). 

In  this  section  and  section  IX,  we  will  demonstrate  that  many  of  the 
popularly  used  AR  methods  (which  includes  the  Burg  algorithm)  may  be 
interpreted  as  providing  statistical  estimates  of  the  fundamental  Yule-Walker 
equations  (2.11)  that  govern  AR  processes.  These  estimates  are  to  be 
obtained  from  the  set  of  contiguous  time  series  observations 

x(l)  ,  x(2),  ....  jc(N)  (5.1) 

which  are  made  available  through  some  measurement  mechanism.  More 
specifically,  it  is  well  known  that  various  contemporary  methods  either 
explicitly  or  implicitly  use  these  observations  to  generate  estimates  of  the 
(p+l)x(p+l)  autocorrelation  matrix  R  which  appears  in  the  fundamental 
relationship  (2.11).  Clearly,  the  elements  of  the  matrix  estimate  R  must  be 
such  that 


R(i,j)  is  an  estimate  of  rx(i-j)  for  1  £  i,j  £  p+1  (5.2) 

Once  these  estimates  have  been  computed  from  the  given  time  series 
observations,  the  resultant  autoregressive  parameter  vector  estimate  is,  in 
accordance  with  expression  (2.11),  obtained  by  solving  the  linear  system  of 
equations. 

R  1  =  lb0|2  e2  (5.3) 
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in  which  the  normalizing  parameter  b0  is  selected  so  that  the  first  component 
of  a  is  one.  The  steps  of  this  general  AS  modeling  approach  are  suaunarized 
in  Table  5.1. 


Step  1:  Compote  Estimates  of  R(i,j)  =  rx(i-j)  for  1  £i»j»ip+l  to  form 
the  (p+l)x(p+l)  antocorrelation  matrix  estimate  S. 


Step  2:  Solve  the  linear  system  of  equations  Ra  =  lb0|2  ej  in  which  the 
normalizing  coefficient  b0  is  selected  so  that  the  first 
component  of  .a  is  one. 

Step  3:  The  requii 

SAR(ej«)  = 

red  AR(p)  spectral  estimate  is  then  specified  by 

2 

b° 

1  +  e-j«>  +  ...  +  ap  e"JP« 

Table  5.1  Basic  steps  in  obtaining  an  AR(p)  spectral  estimate. 


The  quality  of  the  AS  modeling  approach  as  embodied  in  expression  (5.3) 
is  critically  dependent  on  the  choice  of  the  autocorrelation  lag  estimation 
procedure  used.  For  many  applications,  the  standard  unbiased  autocorrelation 
estimates  as  given  by 

R(i,j)  =  - 1 -  ^  x(k+i-j)x(k)  1  i  1  i  P+1  (5.4) 

N-li-jl  1  i  j  i  P+1 

typically  provides  the  best  selection  in  terms  of  spectral  estimation 
performance.  It  is  seen  that  the  autocorrelation  matrix  formed  from  this  set 
of  estimates  will  be  Toeplitz  and  symmetric;  properties  shared  by  the  actual 
autocorrelation  matrix  being  approximated.  Moreover,  this  estimate  is 

A 

consistent  in  the  sense  that  as  N  approaches  infinity,  we  have  R  — ^  R  under 
the  second  order  ergodic  assumption  on  the  time  series.  In  view  of  all  of 
these  favorable  qualities,  it  is  not  surprising  that  the  standard  unbiased 
estimator  (5.4)  generally  provides  excellent  AR  modeling  performance.  In 
Section  IX,  some  of  the  more  popularly  used  adaptive  methods  of  AR  spectral 
estimation  will  also  be  studied. 
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VI.  ASMA  Modeling:  Time  Series  Observations 

The  methods  for  generating  ASMA  models  based  upon  times  series 
observations  fall  into  basically  tvo  categories:  the  a^  and  bfc  parameters 
are  either  evaluated  (i)  simultaneously  or  (ii)  separately.  In  the  first 
category,  maximum  likelihood  based  techniques  form  one  of  the  most  widely 
used  of  such  methods.  These  include  exact  maximum  likelihood  approaches 
(e.g.,  refs  [6]  and  [48]),  and,  least  square  methods  which  approximate  the 
exact  likelihood  function  (e.g.,  refs  [3],  [9],  [29]).  Although  offering  the 
promise  of  optimum  modeling,  these  maximum  likelihood  methods  entail  the 
application  of  nonlinear  programming  solution  procedures.  As  such,  these 
solution  procedures  are  computationally  inefficient,  and,  they  suffer  the 
obvious  drawbacks  characteristic  of  nonlinear  programming  methods.  Other 
nonmaximum  likelihood  methods  which  fall  into  category  ( i)  have  been  proposed 
(e.g.,  see  refs.  [30] , [40] , [60] ) .  These  methods  also  entail  the  utilization 
of  nonlinear  programming  solution  procedures. 

In  recognition  of  the  obvious  shortcomings  of  nonlinear  programming 
based  techniques,  a  number  of  methods  have  been  proposed  which  employ  a 
separate  evaluation  of  the  AR  and  MA  parameters.  By  using  this  approach,  it 
is  generally  possible  to  obtain  satisfactory  modeling  while  not  incurring  the 
drawbacks  of  a  nonlinear  programming  solution  procedure.  These  techniques 
typically  entailed  using  the  first  p  extended  Yule-Walker  equations  to  obtain 
estimates,  in  a  linear  fashion,  for  the  AR  parameters  (e.g.,  see  refs 
[26] , [28] , [35] , [38] ) .  Unfortunately,  the  utilization  of  the  minimal  number 
of  extended  Yule-Walker  equations  (i.e.,  p)  gave  rise  to  an  undesirable 
parameter  hypersensitivity.  In  recognition  of  this  fact,  a  procedure  for 
using  an  overdetermined  set  of  Yule-Walker  equation  evaluations  to  decrease 
this  hypersensitivity  was  proposed  [15].  This  approach  has  since  been 
adopted  by  other  researchers  in  spectral  estimation  applications  with  success 
(e.g.,  see  refs.  [7] , [12] , [36] , [51] ) .  With  this  in  mind,  we  shall  now  give  a 
detailed  development  of  the  overdetermined  approach  to  estimating  the  AS 
parameters  of  an  ASMA  model. 


AS  Parameter  Estimation 

Although  the  procedure  presented  in  Section  II  for  generating  ASMA 
models  is  attractive,  one  is  rarely  provided  with  exact  autocorrelation 
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information.  The  more  common  situation  it  one  in  which  the  only  available 
information  takes  the  form  of  a  finite  set  of  time  series  observations 

x(l),  *<2) . x(N)  (6.1) 

The  task  at  hand  is  to  then  use  these  time  series  observations  to  estimate 
the  parameters  of  a  postulated  ARMA  model.  In  this  parameter  estimation,  we 
shall  seek  to  incorporate  the  philosophy  as  embodied  in  the  extended 
Tale-Walker  ARMA  model  equations  (2.23)  for  estimating  the  model's  a^ 
parameters. 

This  will  effectively  entail  using  the  given  time  series  observations  to 
generate  an  estimate  of  the  tx(p+l)  autocorrelation  matrix  Rj  which  appeara 
in  expression  (2.24).  Namely,  using  any  of  a  number  of  available  procedurea, 
we  first  compute  the  following  autocorrelation  lag  estimates 

A 

Rl(i.j)  =  an  estimate  of  rx(q+l+i-j)  1  U  i  t  (6.2) 

1  1  j  i  P+1 

Two  particularly  attractive  procedures  for  effecting  these  autocorrelation 
estimates  will  be  detailed  at  the  end  of  this  section  and  in  Section  X. 
Independent  of  what  procedure  is  eventually  used,  the  net  result  of  this 
first  step  will  be  the  generation  of  a  tx(p+l)  autocorrelation  matrix 

A 

estimate  Rj .  Due  to  errors  inherent  in  the  autocorrelation  estimation 
process,  however,  this  matrix  estimate  will  generally  have  fnll  rank  (i.e., 
min  (p+l,t))  instead  of  the  theoretical  rank  p  which  is  possessed  by  the 
matrix  Rj  being  estimated.  This  being  the  case,  it  is  therefore  not 
generally  possible  to  find  an  autoregressive  parameter  vector  with  first 
component  equal  to  one  which  will  satisfy  the  theoretical  relationship  R^a  » 
«  as  given  in  equation  (2.23).  As  such,  the  txl  extended  Ynle-Walker 
equation  error  vector  as  specified  by 

A 

1  *  Rl*  (6 .3) 

will  be  generated. 

A  little  thought  will  convince  oneself  that  the  elements  of  this  error 
vector  will  be  composed  of  a  sum  of  many  random  variable  products  (i.e., 
x(k+m)X(m) )  used  in  formulating  the  autocorrelation  lag  estimates. 
Consequently,  an  assumption  that  the  error  vector  elements  tend  to  be 
Gaussianly  distributed  is  a  reasonable  one.  The  joint  density  function  of 
the  extended  Tule-Walker  equation  error  vector  may  be  therefore  approximated 
by 
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(6.4) 


lwl1/2 

p(e)  =  _  e-0.5(e*We) 


( In ) 


t/2 


in  which  W~1  =  E ( e  e ♦)  designates  the  error  covariance  matrix  which  is 
generally  unknown  and  where  the  expected  value  of  jb  is  taken  to  he  zero. 

With  the  availability  of  the  error  joint  density  function  (6.4).  it  is 
now  possible  to  apply  the  maximum- likelihood  concept  for  estimating  the 
autoregressive  parameters.  Namely,  making  use  of  relationship  (6.3)  and  the 
joint  density  function  (6.4),  it  is  possible  to  generate  a  joint  density 
function  for  the  autorgressive  parameter  vector  a  which  will  be  of  form 
p(a)  =  ye_0 .5 ( a*R*WRa) 

We  now  seek  that  vector  a  which  maximizes  this  joint  density  function  subject 
to  the  constraint  that  the  first  component  of  a  be  one.  Ignoring  the  effect 
of  the  multiplicative  term  y,  the  psuedo  maximum-likelihood  selection  for  a 
then  corresponds  to  solving  the  following  constrained  minimization  problem 

min  a*R*WRa  (6.5) 

a(l)=l 

Using  standard  Lagrange  multiplier  techniques,  the  solution  to  this 
constrained  minimization  problem  is  obtained  by  solving  the  following  system 
of  (p+l)x(p+l)  linear  equations 

R*1  W  Rj  a  =  o  ei  (6.6) 

where  a  is  a  normalizing  constant  selected  so  that  the  first  component  of 
j»®  is  one.l  Expression  (6.6)  constitutes  the  so-called  hiah  performance 
method  of  autoregressive  parameter  selection  [153  — C 201 . 

It  is  to  be  noted  that  in  minimizing  functional  (6.5)  with  respect  to 
the  normalization  constraint  imposed  on  a's  first  component,  the  error  vector 
is  being  minimized  in  the  least  squares  sense.  In  effect,  we  are  then 
selecting  a  so  as  to  best  satisfy  the  theoretical  relationship  (2.23)  given 
by  ^lA  =  A*  Using  this  interpretation,  the  positive  definite  matrix  W  can  be 


1  *  „  a 

In  those  rare  cases  where  the  (p+l)x(p+l)  matrix  R^nfR^  is  singular,  the 

autoregressive  parameter  vector  will  correspond  to  a  suitable  normalized 
eigenvector  associated  with  a  zero  eigenvalue  of  this  matrix. 
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alternatively  thought  of  as  providing  a  weighting  (instead  of  being  an 
unknown  covariance  matrix  inverse)  in  the  error  functional  (6.5).  It  is 
therefore  logical  to  take  W  to  be  a  diagonal  matrix  whose  nonnegative 
diagonal  entries  w^  for  k  =  1,2,  ....  t  provide  a  mechanism  for  weighting  in 
any  desirable  fashion  the  various  extended  Yule-Walker  equation 
approximations  appearing  in  (6.3).  The  uniform  weighting  selection 

W  =  I.  (6.7) 

where  I  is  the  txt  identity  matrix  has  been  found  to  provide  excellent 
modeling  performance  when  the  matrix  estimate  Rj  is  unbiased. 

A  few  words  are  now  appropriate  concerning  the  selection  of  the  integer 
t  which  specifies  the  number  of  extended  Yule-Walker  equations  that  are  being 
approximated.  When  t  is  set  equal  to  its  minimal  value  p,  the  approach  here 
taken  bears  a  close  resemblance  to  various  other  ARMA  modeling  schemes  (e.g., 
see  refs.  [26] ,[28] ,[35] ,[38] ) .  In  this  case,  the  minimal  number  of  p  error 
contaminated  extended  Yule-Walker  equation  evaluations  are  being  used  in 
fixing  the  model's  p  autoregressive  coefficients.  A  little  thought  should 
convince  oneself  of  the  potential  parameter  hypersensitivity  which  can  arise 
in  this  situation.  To  illustrate  this  point,  let  us  briefly  consider  the 
task  of  finding  a  line  which  'best'  fits  a  set  of  error  contaminated 
two-tuples  (xfc.yfc).  Although  only  two  two-tuples  are  needed  to  fix  the 
line's  two  parameters  (i.e.,  its  slope  and  y  intercept),  it  will  be  generally 
more  desirable  to  fix  these  parameters  by  using  more  than  this  minimal  number 
of  two-tuples  thereby  obtaining  a  more  'representative  linear  fit'.  This 
will  entail  finding  the  'best  least  squares  linear  fit'.  The  benefits 
generally  accrued  in  using  this  overdetermined  approach  are  demonstrated  in 
Figure  6.1. 

With  the  above  in  mind,  the  real  advantage  of  this  paper's  approach  is 
achieved  when  the  integer  t  is  selected  to  be  larger  than  p.  In  this  case, 
more  than  the  minimal  number  of  extended  Yule-Walker  equation  evaluations 
(i.e.,  t  instead  of  p)  are  being  used  in  fixing  the  model's  p  autoregressive 
coefficients.  It  is  then  not  surprising  that  a  desirable  decrease  in 
parameter  hypersensitivity  is  generally  realized  upon  selecting  t  >  p.  An 
indication  of  the  benefit  accrued  by  selecting  t  >  p  was  illustrated  in 
Section  III  for  the  case  of  AR  modeling  with  perfect  autocorrelation  lag 
values.  A  similar  advantage  will  be  demonstrated  in  Section  VIII  when  ARMA 
models  are  generated  from  raw  time  series  observations.  In  the  situation 


being  considered  here,  the  integer  parameter  t  is  typically  selected  to  lie 
within  the  range 

P  i  t  i  N-q-1  (6.8) 

with  generally  larger  values  than  the  minimum  p  being  preferred  for  modeling 
fidelity. 

From  an  overall  modeling  viewpoint,  the  standard  unbiased  estimator  has 
been  found  to  generally  provide  the  best  choice  for  the  lag  estimates 
required  in  expression  (6.2)  .  Specifically,  the  required  autocorrelation  lag 
estimate  entries  are  generated  according  to 

N-n 

a  1  \  x(k+n)X(k)  (Kniq+t  (6.9) 

rx(n)  =  -  L 

N-n  k=l 

where  q+t  corresponds  to  the  largest  autocorrelation  lag  argument  appearing 

A 

in  matrix  Rj .  We  would  of  course  use  the  property  that 

cx(~n)  =  rx ( n )  to  obtain  any  negative  lag  autocorrelation  entries  which  may 

A 

be  needed  in  formulating  Rj .  In  using  this  unbiased  estimate  approach,  the 
resultant  autocorrelation  matrix  estimate  will  have  a  desirable  Toeplit? 
structure. 

The  (p+l)x(p+l)  matrix  RjWRj,  which  completely  characterizes  the 
autoregressive  parameter  vector  solution  through  expression  (6.6),  will  have 
components  which  are  readily  computable  from  the  estimates  (6.9) .  Using 
simple  matrix  manipulations,  it  is  readily  shown  that  the  general  (i,j)th 
element  of  this  matrix  is  specified  by 
t 

RjwRi ( i , j )  =  ^  w(m)r(q+m+l)  r(q+m+i-j)  for  1  £  i,j  (  p+1  (6.10) 

m=l 

where  the  w(m)  correspond  to  the  diagonal  elements  of  the  diagonal  weighting 

A  £  A 

matrix  W.  Upon  generation  of  the  matrix  R^WR^  according  to  this  expression, 
the  required  autocorrelation  parameter  vector  is  straightforwardly  obtained 
by  solving  the  system  of  linear  equations  (6.6).  A  Fortran  program  listing 
of  an  implementation  of  this  procedure  is  given  in  the  appendix  where  the 
flexibility  of  using  the  standard  unbiased  or  the  standard  biased  (i.e,, 
divisor  N-n  in  equation  (6.9)  is  replaced  by  N)  autocorrelation  estimate  is 
available. 


MA  Parameter  Estimation 

To  complete  the  ARMA  modeling,  it  is  necessary  to  compute  an  estimate 
for  the  moving  average  component  |Bq(eJ“)|2.  It  has  been  the  author's 
experience  that  independent  of  which  procedure  is  used,  this  MA  component 
estimate  is  almost  always  of  significantly  lower  quality  than  the  associated 
AR  component 

P  2 

|Ap(ej“)|2  =  |  ^  ak  eJ^j  (6.11) 

k=0 

in  which  ak  denote  the  autoregressive  parameter  estimates  as  generated  from 
expression  (6.6).  A  high  quality  low  order  MA  spectral  estimator  has  yet  to 
be  developed.  Despite  this  shortcoming,  some  reasonably  well  performing  MA 
estimators  will  now  be  briefly  discussed. 

Many  contemporary  MA  component  estimators  are  based  on  utilizing  the 
forward  and  backward  residual  time  series  associated  with  an  ARMA  time 
series.  In  particular,  the  forward  residual  time  series  elements  are 
computed  from  the  given  observations  (6.1)  and  the  autoregressive  parameter 
estimates  (6.6)  according  to 

P 

sf(n)  =  ^  ak  x(n-k)  p+1  i  n  i  N  (6.12) 

k=0 

Similarly,  the  backward  residuals  component  are  generated  using 
P 

®b(n)  =  ^  ak  x(n+k)  1  <.  n  £  N-p  (6.13) 

k=0 

As  indicated  in  Section  II,  each  of  these  residual  time  series  will  be 
governed  by  the  same  MA(q)  process  if  the  time  series  (x(n)}  is  an  ARMA(p,q) 
process  with  autoregressive  parameters  ak.  With  this  in  mind,  a  procedure 
for  extracting  this  MA  characterization  from  the  computed  forward  and 
backward  residuals  will  now  be  given. 

The  most  direct  procedure  for  achieving  the  required  MA(q)  estimate  is 
to  first  generate  the  following  estimates  of  the  residual  time  series'  first 


q+1  autocorrelation  lags 


rs(n) 


1 

N-p-n 


N-p-n 

^  [sf (n+p+k)7f(p+k) 
k=l 


+  Sb(n+k)  sb<k)] 


Oiniq 

(6.14) 


If  the  residual  time  series  do  in  fact  correspond  to  a  MA(q)  process,  it  will 
be  found  that  the  rs(n)  will  be  approximately  zero  for  n^q+1 .  This  can  be 
used  as  a  convenient  test  for  the  appropriateness  of  the  ARMA  model,  the 
order  selection,  and,  the  estimates  aj^.  In  any  case,  upon  taking  the  Fourier 
transform  of  these  autocorrelation  lags,  we  obtain  the  MA(q)  spectral 
estimate  component 

q 

lBq(ej"»)l2  =  ^  w(n)  ^(nje-jwn  (6.15) 

n=-q 


in  which  w(n)  is  a  window  sequence  and  use  of  the  fact  that  rs(-n)  =  rs(n) 
will  be  made  when  evaluating  (6.14).  The  overall  ARMA(p,q)  spectral  estimate 
is  then  given  by 


S(ej«») 


|Bq(ej<*>) 


Uo'(ei“) 


(6.16) 


A 

where  ApfeJ10)  is  specified  by  expression  (6.11). 

A  few  words  are  now  appropriate  concerning  the  selection  of  the  window 
to  be  used  in  estimate  (6.14).  If  the  rectangular  window  choice  w(n)  -  1  is 
made,  this  estimate  will  not  have  the  desired  property  of  being  guaranteed 
positive-semidef inite.  To  achieve  this  positive-semidefiniteness,  we  could 
instead  choose  the  window  to  be 


w(n) 


(6.17) 


Unfortunately,  this  selection  can  give  rise  to  a  seriously  distorted  MA 
estimate  in  view  of  the  triangular  like  weighting  thereby  employed.  The 
selection  of  w(n)  is  quite  important  and  this  choice  should  be  based  on  the 
particular  application  at  hand  and  user  experience. 


It  is  also  possible  to  employ  the  smoothed  periodogram  to  obtain  another 
form  of  MA(q)  estimate.  This  entails  segmenting  the  computed  residuals  in 
blocks  of  length  q+1  (overlapping  or  not  overlapping)  and  then  averaging  the 
resultant  q+1  length  periodograms  for  each  of  these  blocks.  This  procedure 
has  been  employed  with  a  moderate  degree  of  success  [17].  Similarly,  we 
could  make  obvious  adaptions  of  the  procedures  treated  in  Section  II  under 
the  ARMA  modeling  subsection  to  achieve  alternate  MA  estimates.  For  example, 
if  we  were  to  use  the  procedure  as  characterized  by  expression  (2.38), 
estimates  for  the  cn  parameters  would  be  computed  from 

n-1 

cn  -  J  **  rx(n-k)  1  i  n  £  p  (6.18) 

k=0 

The  required  ARMA  spectral  estimates  would  then  be  given  by  incorporating 
these  estimates  into  expression  (2.33)  to  result  in 

A 

Sx(ej<*>)  =  rx(0)  +  2Re  [D(eJ“)l  (6.19) 

where  D(eJw)  is  obtained  by  substituting  the  a^  and  cfc  estimates  into  form 
(2.34) . 
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VII.  ARMA  Modeling:  A  Singular  Value  Decomposition  Approach 

We  have  yet  to  address  the  important  issue  of  ARMA  model  order 
determination.  In  particular,  whether  one  is  provided  with  exact 
autocorrelation  lags  or  time  series  observations  for  effecting  the  modeling, 
how  one  chooses  appropriate  values  for  the  order  parameters  p  and  q  remains 
an  open  question.  It  is  recognized  that  this  model  order  information  is 
implicitly  contained  in  the  autocorrelation  matrices  which  characterize  ARMA 
models.  In  this  section,  we  shall  present  a  procedure  for  extracting  the 
crerequisite  model  order  values  which  will  make  use  of  a  singular  value 
decomposition  of  an  extended  autocorrelation  matrix.  An  important  byproduct 
of  this  procedure  will  be  an  adaption  of  the  ARMA  modeling  procedure  of  the 
previous  section  which  provides  for  a  significant  improvement  in  spectral 
estimation  performance. 

When  the  ARMA  model  order  parameters  are  not  known  apriori,  it  will  be 
judicious  to  select  the  initial  model  order  to  be  much  larger  than  the 
'anticipated'  order.  In  particular,  let  us  consider  the  extended  order  ARMA 

(Pe«<le)  model  for  which  pe  is  selected  to  be  larger  (usually  much  larger) 
than  the  eventual  model  order  parameter  p  to  be  used.  Although  we  typically 
do  not  know  p  apriori,  it  is  generally  possible  to  make  an  educated  guess  of 
p  so  as  to  ensure  that 

Pe  >  P  (7.1) 

In  accordance  with  expression  (2.23)  ,  it  then  follows  that  the  tx(pe+l) 
extended  order  autocorrelation  matrix  associated  with  this  ARMA(pe,qe)  model 
may  be  expressed  as 

rx  ( Re+1 )  I’x<1e)  •  •  •  rx(qe-Pe+l) 

rx(le+D  tjlqe+l)  ...  ^(^"Pe^) 


Re  = 


*x(qe+t) 


rx(qe+t“l> 


rxUe'Pe+t) 


(7.2) 


If  the  autocorrelation  lag  entries  used  in  this  matrix  correspond  to  an 
ARMA  (p,q)  process  for  which  qe-pe  2.  q~p»  it  then  follows  from  the  results  of 
section  II  that  the  rank  of  the  tx(pe+l)  matrix  Re  will  be  p.  In  arriving  at 
this  result,  we  of  course  assume  that  t  is  selected  to  at  least  equal  p.  To 
determine  the  required  order  parameter  p,  we  then  simply  set  p  equal  to  the 


rank  of  Re  for  the  idealistic  case  in  whioh  exact  autocorrelation  lag 
information  is  available. 

To  obtain  the  ARMA  model's  (p+l)xl  antoregressive  parameter  vector  a 
from  this  extended  order  autocorrelation  matrix,  it  is  possible  to  appeal  to 
the  theoretical  developments  of  Section  II.  In  particular,  let  us  consider 
the  set  of  submatrices  of  Re  formed  from  any  p+1  contiguous  columns.  This 
set  of  tx(p+l)  matrices  is  specified  by 

Rfc  =  [submatrix  of  Re  composed  of  its  through 

p+kik  column  vectors  inclusively]  for  l£k£pe-p+l  (7.3) 

In  accordance  with  the  ARMA  model  extended  Yule-Walker  equation 
relationships,  it  is  readily  established  that  the  required  unique 

autoregressive  parameter  vector  a  will  satisfy  the  set  of  homogeneous 
relationships 

Rya  =  e  for  l<k£pe-p+l  (7.4) 

where  the  first  component  of  _a  is  constrained  to  be  one.  In  point  of  fact, 
expression  (7.4)  provides  a  matrix  representation  for  the  t  extended 

Yule-Walker  equations  (2.22)  defined  on  the  specific  indices  qe+2-k  £  n  £ 
qe+t+l-k.  It  is  important  to  note  that  this  conclusion  will  be  valid  only  if 
the  autocorrelation  lag  entries  used  in  forming  Re  correspond  to  an  ARMA 

(p,q)  process,  and,  the  order  parameters  are  such  that  Pe  2.  P  *ud  qe-pe  > 

q-p. 

We  shall  now  apply  this  rank  characterization  of  Re  to  the  practical 
problem  in  which  the  ARMA  modeling  is  to  be  based  only  on  the  time  series 
observations 

x(l).  x(2) . x(N)  (7.5) 

and  not  on  actual  autocorrelation  lag  information.  In  this  case,  it  will  be 
necessary  to  first  compute  autocorrelation  lag  estimates  from  these 
observations.  These  estimates  are  next  substituted  into  the  matrix  format 
(7.2)  to  in  turn  generate  the  extended  order  autocorrelation  matrix  estimate 
Re.  Since  the  autocorrelation  lag  estimate  entries  will  be  invariably  in 
error,  it  follows  that  the  matrix  Re  will  normally  have  full  rank  (i.e.,  min 
(Pe+l»t))  even  when  the  time  series  under  study  corresponds  to  an  ARMA  (p,q) 
process.  Nonetheless,  even  though  Re  will  have  full  rank,  its  'effective' 
rank  will  still  tend  to  be  p.  To  better  quantify  the  vague  term  'effective' 
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rank,  it  will  be  beneficial  to  introduce  the  principle  of  singular  Talus 
decomposition. 

Singular  Value  Decomposition 

In  a  variety  of  applications,  the  ultimate  objective  will  be  that  of 
solving  a  linear  system  of  equations.  The  matrix  associated  with  this  system 
of  equations  not  only  characterizes  the  desired  solution,  but,  it  will  also 
very  often  convey  dynamical  property  information.  With  this  in  mind,  it 
often  behooves  us  to  examine  the  salient  properties  of  this  characterizing 
matrix.  The  singular  value  decomposition  of  a  matrix  as  outlined  in  the 
following  theorem  serves  this  role  particularly  well  (e.g.,  see  ref.  [27]  and 
139]). 

Theorem  7.1:  Let  A  be  a  mxn  matrix  of  generally 
complex  valued  elements.  Then  there  exists  mxm 
and  nxn  unitary  matrices  U  and  V,  respectively, 
such  that* 

A  =  02V*  (7.6) 

where  X  is  a  mxn  matrix  whose  elements  are  zero 
except  possibly  along  its  main  diagonal.  These 
nonnegative  diagonal  elements  are  ordered  such 
that 

®11  ff22  ^  °hh  ^  0 

where 

h  =  min  (m, n) . 

The  diagonal  elements  are  commonly  referred  to  as  the  singular  values  of 

matrix  A.  It  is  well  known  that  the  nonzero  singular  values  will  correspond 
to  the  positive  square  roots  of  the  eigenvalues  of  the  nonnegative  Hermitian 
matrices  AA*  and  A*A.  Moreover,  the  columns  of  U  (or  V)  will  correspond  to 
the  appropriately  ordered  orthonormal  eigenvectors  of  the  nonnegative 
Hermitian  matrices 
AA*  (or  A*A) . 

The  singular  values  o^  convey  valuable  information  concerning  the  rank 
characterization  of  matrix  A.  This  is  readily  demonstrated  upon  considering 


*The  matrices  D  and  V  are  said  to  be  unitary  if  U~1  =  U*  and 
V"1  «  V*. 
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the  problem  of  finding  that  mxn  matrix  of  rank  k  which  will  heat  approximate 
A  in  the  Frobenions  norm  sense  (this  assumes  that  gyy  >  0  with  k  £  h) .  Hie 
Frobenions  norm  of  the  mxn  matrix  difference  A-B  is  defined  to  be 


llA  -  B|| 


5  ,aij-bij|2 
j=l 


(7.7) 


We  now  seek  to  find  that  mxn  rank  k  matrix  B  which  will  render  this  criterion 
a  minimum.  The  solution  to  this  approximation  problem  is  contained  in  the 
following  theorem  [27] 

Theorem  7.2:  The  nniqne  mxn  matrix  of  rank  k  £ 

Rank  [A]  which  best  approximates  the  mxn  matrix  A 
in  the  Frobenions  norm  sense  is  given  by 

A(k)  =  UZkv‘  (7.8) 

where  D  and  V  are  as  in  expression  (7.6)  while  2:^ 
is  obtained  from  "2.  by  setting  to  zero  all  but 
its  k  largest  singular  values.  The  quality  of 
this  optimum  approximation  is  given  by 


llA  -  A(k)|| 


r  h  11/2 

}  °jj2  0<kih 
J=k+1 


(7.9) 


The  degree  to  which  A^  approximates  A  is  seen  to  be  dependent  on  the 
sum  of  the  (hr-k)  smallest  sinaular  values  squared.  As  k  approaches  h,  this 
sum  will  become  progressively  smaller  and  will  eventually  go  to  zero  at  k  * 
h.  In  order  to  provide  a  convenient  measure  for  this  behavior  which  does  not 
depend  on  the  size  of  matrix  A,  let  us  consider  the  normalized  ratio 


"I  (k)  = 


llAOOII 

llAll 

ffll2  +  ff222  +  •  •  •  +  ffkk2"!1^2 
®112  +  ®222  +  •  •  •  +  ®hh2  J 


1  i  k  i  h 


(7.10) 


Clearly,  this  normalized  ratio  approaches  its  maximum  value  of  one  as  k 
approaches  h.  For  matrices  of  low  effective  rank,  the  quantity  (k)  is  dose 
to  one  for  values  of  k  significantly  smaller  than  h.  On  the  other  hand. 
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matrices  for  which  k  most  take  on  high  values  (i.e.,  k  h)  to  achieve  a  (k) 
near  one  are  said  to  be  of  high  effective  rank. 

Application  of  SVD  to  ARMA  Modeling 

To  determine  the  required  order  for  an  ARMA  model,  we  shall  now  make  a 
SVD  of  the  tx(pe+l)  extended  order  autocorrelation  matrix  estimate  Re,  that 
is 

Re  =  CSV*  (7.11) 

where  U  and  V  are  txt  and  (pe+l)x(pe+l)  unitary  matrices,  respectively  and 
ST  is  a  tx(pe+l)  matrix  of  the  form  called  out  in  Theorem  7.1.  The  required 
autoregressive  order  p  is  obtained  by  examining  the  normalized  ratio  K  (k) . 
Namely,  p  is  set  equal  to  the  smallest  value  of  k  for  which  ~Hk)  is  deemed 
'adequately'  close  to  one.  The  terminology  'adequately  close  to  one'  is 
subjective  and  will  depend  on  the  particular  application  under  consideration 
as  well  as  user  experience  gained  through  empirical  experimentation.  In  any 
case,  the  net  result  of  this  step  will  be  a  rank  p  optimum  approximation  of 
the  tx(pe+l)  extended  order  autocorrelation  matrix,  that  is 

Re(p>  =  0  Ip  V*  (7.12) 

A  simple  matrix  manipulation  reveals  that  this  rank  p  approximation  may  be 
equivalently  represented  as 
P 

®e^P^  =  ^  ®nn  fin  Xn*  (7.13) 

n=l 

where  u^  and  v^  are  the  ktR  column  vectors  of  the  txt  and  (pe+l)x(pe+l) 
unitary  matrices  U  and  V,  respectively.  We  shall  now  provide  two  separate 
procedures  for  using  this  rank  p  approximation  for  effecting  autoregressive 
parameter  estimates. 


Method  I:  ARMA  (pa,qe)  model 


In  this  approach,  the  rank  p  approximation  (7.12)  is  interpreted  as 
providing  an  improved  estimate  of  the  underlying  extended  autocorrelation 
matrix.  It  will  be  convenient  to  decompose  this  rank  p  approximation  as 
follows 

*e(p>  *  lri(P>  :  Ra<P>]  (7.14) 
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where  r^P)  is  the  left  most  txl  column  vector  of  Re^P)  and  Ra^P)  is  a  txpe 
matrix  composed  of  the  pe  right  most  txl  column  vectors  of  Re(p).  We  now 
seek  a  (pe+l)xl  autoregressive  parameter  vector  &  with  first  component  equal 
to  one  that  will  satisfy  the  theoretical  relationship 

Re<P>  S.  =  § 

Since  the  rank  of  Re(P^  is  less  than  full,  there  will  exist  an  infinity  of 
solutions  to  this  problem.  We  shall  select  the  minimum  norm  solution  as 
specified  by 


[Ra(p)]#  r.!<P> 


in  which  the  superscript  notation  #  denotes  the  operation  of  generalized 
(psuedo)  matrix  inversion.  This  autoregressive  parameter  selection  procedure 
has  proved  to  be  particularly  effective  in  low  SNR  environments.  It  is 
readily  shown  that  this  minimum  norm  solution  can  be  simplified  to 
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p 

-1  “  I  Vk(o)  Vk 


Pe 

1  -  }  lvk(o)|2 

k=l 
Pe 

^  vk(o)vi 

k=p+l 

Pe 

J  lvk(o)l2 

k=p+l 


(7.15) 


where  the  v^  correspond  to  the  colnmn  vectors  of  the  unitary  matrix  V 
appearing  in  the  SVD  representation  (7.12). 


Method  II:  A&MA  (p.q)  Model 


The  best  rank  p  approximation  matrix  (7.12)  contains  within  its  column 
structure  the  characteristics  required  to  estimate  autoregressive  parameters 
of  a  lower  order  ARMA(p,q)  model.  In  particular,  the  submatrices  of  Re(p) 
composed  of  its  columns  k  through  p+k  inclusively  yield  rank  p  approximations 
of  the  tx(p+l)  autocorrelation  matrices  R^  for  1  1  k  <. 
Pe  ~  P+1  ss  specified  by  expression  (7.3).  We  shall  denote  these  rank  p 
approximations  by  Rj^p^.  Due  to  the  SVD  operation  and  errors  inherent  in 
generating  Re,  there  will  generally  not  exist  a  unique  autoregressive 
parameter  vector  with  first  component  equal  to  one  which  will  satisfy  all  of 
the  pe-p+l  estimates  of  relationships  (7.4).  Nonetheless,  it  is  still 
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desirable  to  find  an  autoregressive  parameter  vector  for  which  each  of  these 
relationships  are  almost  satisfied.  A  functional  that  measures  the  degree  to 
which  this  is  accomplished  is  given  by 


f(a)  =  a*  S<P>a  (7.16a) 

where 

Pe~P+1 

S(P>  «  Rk(p)*Rk(p>  (7.16b) 

k=l 

The  (p+l)x(p+l)  matrix  S^P)  is  nonnegative  Hermitian  and  may  be  conveniently 
computed  using  the  relationship 

P  Pe-P+1 

s(p)  =  J  1  Znk  £nk*  (7.17) 

n=l  k=l 


in  which  v^k  denotes  the  (p+l)xl  vector  as  specified  by 

vnk  -  (vn(k) ,vn(k+l) ,  ....  vn(k+p)] *  (7.18) 

1  <  k  i  pe-p+l 
1  <  n  i  p 

This  vector  is  seen  to  be  a  windowed  segment  of  the  ntk  column  vector  (i.e., 
vn)  of  the  unitary  matrix  V  that  in  part  identifies  the  SVD  representation 
(7.11).  Moreover,  due  to  the  simple  shift  relationship  between  the  vectors 
X*  and  v*+1 ,  it  is  possible  to  devise  an  iterative  procedure  for  updating  the 

u  u  j,  j, 

(p+l)x(p+l)  matrices  v^v^*  as  k  evolves.  This  will  entail  (p+1)  computations 
for  each  value  of  k. 

Upon  generating  the  (p+l)x(p+l)  matrix  S^P^,  we  next  wish  to  select  that 
autoregressive  parameter  vector  a  with  first  component  of  one  so  as  to 
minimize  quadratic  functional  (7.16).  This  constrained  minimization  will 
result  in  the  best  least  square  approximation  of  the  theoretical 
relationships  (7.4).  Using  standard  procedures,  the  required  optimum 


autoregressive  parameter  vector  is  found  by  solving  the  following  linear 
system  of  equations! 

S(p)a  =  a  e i  (7.19) 

in  which  the  normal iz ing  constant  a  is  selected  so  that  the  first  component  of 
a.  is  one.  It  will  be  shown  in  the  next  section  that  this  SVD  version  of  the 
ARMA  modeling  procedure  can  lead  to  a  significant  improvement  in  modeling 
performance . 

The  concept  of  a  SVD  representation  has  been  previously  incorporated 
with  success  in  effecting  AR  models  [42]  and  [61].  Incorporation  of  an  SVD 
AR  model  was  there  shown  to  produce  an  increase  in  spectral  resolution 
capabilities.  More  recently,  the  SVD  representation  was  used  in  ARMA 
modeling  where  impressive  results  were  reported  [22].  Undoubtedly,  the  impact 
which  SVD  will  ultimately  have  on  spectral  estimation  (and  in  other 
applications)  is  only  beginning  to  be  appreciated. 


*In  those  rare  cases  where  S  p  is  singular,  the  required  autoregressive 
parameter  vector  is  set  equal  to  an,  appropriately  normalized  eigenvector 
associated  with  a  zero  eigenvalue  of  Svp  . 
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VIII.  Numerical  Examples 

In  this  section,  we  shall  investigate  the  comparative  spectral 
estimation  performance  of  the  rational  modeling  procedures  as  developed  in 
Sections  VI  and  VII  with  those  of  popularly  used  alternatives.  The  first 
example  will  treat  the  problem  of  effecting  a  rational  spectral  estimate  from 
a  set  of  observations  of  an  ARMA(4,4)  process.  In  the  second  example,  we 
shall  examine  the  modeling  performance  for  the  special  case  of  sinusoids  in 
white  noise. 

Example  1:  In  this  example,  we  shall  examine  the  time  series  as 

characterized  by  (see  ref.  [11]) 

x(n)  =  x1(n)  +  X2<n)  +  0.5  e(n)  (8.1a) 

which  is  composed  of  the  two  AR(2)  time  series  generated  according  to 

*l(n)  =  0.4  xi(n-l)  -  0.93  xi(n-2)  +  ejtn)  (8.1b) 

X2(n)  =  -0.5  X2(n-1)  -  0.93  X2(n-2)  +  82(n)  (8.1c) 

where  e(n),  ei(n)  and  e2(n)  are  mutually  uncorrelated  Ganssian  zero  mean 
white  noise  processes  with  variance  one.  A  simple  analysis  indicates  that 
the  power  spectral  density  function  associated  with  time  series  (8.1)  is 
given  by 

Sx(w)  -  |l  -  0.4e~j“  +  O^e-jZwr2  + 

ll  +  0.5e'j«  +  0.93e-j*»r2  +  0-25  (8.2) 

and  is  plotted  in  Figure  7.1a. 

Using  the  time  series  description  (8.1),  twenty  statistically 

independent  realizations  each  of  length  125  were  next  generated.  These  20 
realizations  were  then  used  to  compare  the  modeling  effectiveness  of  this 
paper's  method  with  the  Box-Jenkin's  maximum-likelihood  method.  The  twenty 
(one  for  each  realization)  superimposed  ARMA  (4,4)  spectral  estimates 
obtained  using  the  Box-Jenkins  iterative  method  are  shown  in  Figure  8.1b. 
The  number  of  iterations  required  to  achieve  these  estimates  ranged  from  10 
to  700  with  50  being  a  typical  requirement.  Next,  this  paper's  method  as 

represented  by  expression  (6.6)  with  unbilled  autocorrelation  lag  estimates 

and  W  =  I  w as  used  to  obtain  the  ARMA  (4,4)  model's  autoregressive 

coefficients.  Relationship  (6.15)  with  the  window  selection  (6.17)  was  used 
in  forming  the  MA  component  of  the  spectral  estimates.  The  twenty 
superimposed  ARMA  (4,4)  spectral  estimates  thereby  obtained  are  shown  in 
Figures  8.1c,  8. Id,  and  8.1e  for  various  choices  of  t.  From  these  plots,  it 


61 


is  apparent  that  progressively  improved  estimates  are  achieved  npon 
increasing  t  from  its  minimal  valae  of  4,  to  6,  and  then  to  20.  Moreover, 
these  spectral  estimates  were  of  higher  quality  than  those  obtained  with  the 
maximum-likel ihood  method  which  exhibited  a  larger  variance  in  estimate. 

Example  2:  In  this  example,  we  shall  investigate  the  comparative 

spectral  estimation  performances  of  various  widely  used  methods  on  the 
classical  sinusoids  in  additive  white  noise  problem.  The  particular  time 
series  to  be  considered  is  given  by 

x(n)  =  sin(2nfln)  +  sin(2»tf2n)  +  w(n),  l(n(N  (8.3) 

fj  =  0.2,  f2  =  0.215,  =  0.5 

This  time  series  was  previously  examined  in  Section  III  where  different 
rational  models  were  generated  from  the  ’exact'  autocorrelation  lags 
associated  with  it.  This  is  a  particularly  appropriate  time  series  for 
testing  the  resolution  capabilities  of  spectral  estimators  because  of  the 
closeness  of  the  sinusoidal  frequencies  (i.e.,  fj-fj  =  0.015)  and  the 
prevailing  low  signal-to-noise  ratio  of  zero  dB  (individual  sinusoid  power  to 
total  noise  power). 

In  order  to  gain  a  reasonable  good  statistical  basis  for  comparison,  ten 
statistically  independent  realizations  of  the  time  series  (8.3)  were 
generated  with  each  realization  being  of  length  128  (i.e.,  N  =  128).  Dsing 
these  ten  different  sets  of  time  series  observations,  ten  spectral  estimates 
were  made  for  various  widely  used  rational  spectral  estimators.  The 
resultant  ten  spectral  estimates  for  each  estimator  were  then  plotted  in 
Figures  8.2  to  8.7  in  a  superimposed  fashion  (except  for  the  periodogram)  so 
as  to  depict  consistency  of  estimate.  The  ideal  estimate  would  of  course  be 
two  sharply  defined  peaks  at  frequencies  0.2  and  0.215.  A  brief  description 
of  the  different  estimators  and  their  performance  on  these  test  samples  is 
now  given. 

MA  Estimates:  The  periodogram  as  implemented  by  the  fast  Fourier 

transform  was  first  used  in  generating  spectral  estimates  for  each  of  the  ten 
different  128  data  length  realizations.  Specifically,  expression  (4.7)  with 
N  -  128  was  incorporated  into  the  MA  spectral  estimator  (4.5)  to  generate  the 
sampled  periodogram  estimate 


62 


1 

N 


N-l 

J 

n=0 


x(n+l)  e-j-“ 


0  i  k  i  N-l 


(8.4) 


It  was  found  that  each  of  the  ten  periodograms  produced  remarkably  similar 
results.  A  typical  128  point  FFT  periodogram  for  one  of  these  trials  is 
shown  in  Figure  8.2a.  From  this  plot  (and  the  nine  others  not  shown),  it  was 
not  possible  to  unambiguously  detect  the  presence  of  two  spectral  peaks  at 
frequencies  0.2  and  0.21S. 

In  order  to  ease  the  potential  ambiguity  created  by  the  finite  frequency 
sampling  of  the  periodogram  (i.e.,  Aw  =  2n/N)  ,  the  concept  of  padding  as 
described  in  Section  IV  was  next  incorporated.  Using  this  approach,  the 
original  time  series  observation  of  length  128  was  next  appended  with  128 
zeroes.  The  resultant  256  point  padded  FFT  periodogram  is  shown  in  Figure 
8.2b.  In  this  padded  case,  we  are  able  to  unambiguously  detect  the  presence 
of  the  two  spectral  peaks  at  0.2  and  0.215.  A  further  padding  of  256  zeroes 
is  found  to  result  in  the  512  point  padded  FFT  periodogram  shown  in  Figure 
8.2c.  The  prerequisite  spectral  resolution  is  again  achieved. 

AR  Estimates:  In  AR  modeling,  the  most  widely  used  procedure  is  the 

Burg  algorithm.  With  this  in  mind,  the  Burg  algorithm  was  next  used  to 
generate  spectral  estimates  for  each  of  the  aforementioned  ten  observation 
sets  of  length  128.  The  ten  superimposed  Burg  AR(20)  estimates  which 
resulted  are  depicted  in  Figure  8.3a.  Although  a  detection  of  spectral 
energy  in  the  region  about  f  =  0.2  is  evident,  the  appearance  of  two  spectral 
peaks  is  not.  The  ordering  selection  p=20  was  evidently  not  sufficient  for 
the  required  resolution.  Upon  increasing  the  AR  order  to  p  =  24,  however, 
the  Burg  AR(24)  estimates  produced  two  reasonably  well  defined  peaks  about  f 
=  0.2  and  f  =  0.215  in  nine  out  of  the  ten  estimates.  These  estimates  are 
plotted  in  superimposed  fashion  in  Figure  8.3b.  It  was  further  determined 
that  more  sharply  defined  peaks  are  achieved  in  all  ten  estimates  when  the 
order  was  increased  to  forty.  The  Burg  algorithm  is  then  seen  to  provide  a 
satisfactory  resolution  performance  for  the  time  series  under  study  provided 
that  the  AR  order  is  selected  to  be  on  the  order  of  24  or  more. 

In  order  to  demonstrate  the  effect  of  using  more  than  the  minimal  number 
of  extended  Yule-Walker  equations  in  arriving  at  an  AR  model  (the  Burg 
algorithm  uses  the  minimal  number),  the  ARMA  modeling  technique  as  embodied 
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in  expression  (6.6)  with  W  =>  I  and  unbiased  autocorrelation  lag  estiaates  was 
next  used  with  p  =  20,  q  =  0,  and,  t  ■  50.  The  resultant  ten  AR(20)  spectral 
estimates  which  arose  when  using  this  approach  are  shown  in  Figure  8.3c.  A 
resolution  of  the  two  sinusoids  was  achieved  in  all  ten  estimates.  It  is 
significant  that  the  lower  order  AR(20)  spectral  estimates  as  generated  using 
this  paper's  method  provided  more  sharply  defined  peaks  than  the  higher  order 
Burg  AR(24)  spectral  estimates.  This  is  primarily  due  to  the  fact  that  fifty 
extended  Yule-Walker  equations  were  used  in  specifying  the  20  autoregressive 
parameters.  Tho  degree  of  smoothing  achieved  in  applying  this  approach  is 
evident  from  this  numerical  example. 

ARMA  Estimates:  The  ARMA  modeling  procedure  as  represented  by 

expression  (6.6)  with  W  =  I  and  unbiased  autocorrelation  lag  estimate  entries 
was  next  used  to  generate  estimates  of  the  autoregressive  coefficients  of  an 
ARMA(p,p)  model  for  p  =  8  and  12.  In  accordance  with  the  results  of  Section 
III,  plots  of  |Ap(ei“)l-2  were  then  made  so  as  to  reveal  the  required 

spectral  information  for  the  sinusoids  in  white  noise  case  (i.e.,  the  zeroes 
are  not  used).  In  Figure  8.4a,  the  ten  AR(8,8)  spectral  estimates  which 
arose  for  a  choice  of  t  =  70  are  shown  superimposed.  Although  spectral 
energy  in  the  neighborhood  of  f  =  0.2  is  detected,  the  presence  of  the 
required  two  spectral  peaks  is  not.  Clearly,  the  order  selection  p=8  was  not 
sufficient  to  achieve  the  desired  resolution.  Upon  increasing  the  order  to 

ARMA  (12,12)  and  retaining  t  =  70,  however,  the  resultant  ten  spectral 

estimates  shown  in  Figure  8.4b  each  achieved  the  desired  spectral  resolution 
with  two  sharply  defined  peaks  about  f  =  0.2  and  f  *  0.213,  These  spectral 
estimates  have  been  obtained  with  but  twelve  autoregressive  parameters,  and, 
are  seen  to  be  significantly  superior  to  the  Burg  AR(24)  estimates  which 
required  twenty-four  autoregressive  parameters.  In  terms  of  spectral 
estimation  fidelity  and  parameter  parsimony  (i.e.,  effective  use  of 
parameters),  it  is  clear  that  the  ARMA  modeling  method  herein  developed  has 
provided  a  superior  performance  for  the  problem  at  hand. 

A  truly  significant  increase  in  spectral  estimation  performance  is 
achieved  upon  adopting  the  SV0  approaches  to  ARMA  modeling  as  outlined  in 
Section  VII.  Namely,  after  setting  pe  =  qe  =  14  and  t  =  50,  it  was  found 
that  the  effective  rank  of  the  extended  order  autocorrelation  matrix  estimate 
Re  was  four.  Setting  p=4  and  using  relationship  (7.15),  the  ten  ARMA(14,14) 
spectral  estimates  which  arose  are  shown  in  Figure  8.4c.  Next,  letting  p*4 
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in  expression  (7.17) >  the  ten  SVD  derived  ARMA(4,4)  spectral  estimates  which 
arose  are  shown  superimposed  in  Figure  8.4c.  These  spectral  estimatea  are 
not  only  of  uniformly  high  quality,  but,  they  represent  the  lowest  order 
rational  model  which  is  compatible  with  the  two  sinusoids  in  white  noise 
case.  Moreover,  the  quality  of  the  peak  frequency  estimatea  and  associated 
pole  magnitude  (theoretically  equal  to  one)  estimates  is  exceptional  as  shown 
in  Table  8.1.  The  quantities  ffc(pfc)  and  «fj.2  (<rp^2)  for  *  =  1*2  represent 
the  sampled  means  and  variances,  respectively,  of  the  peak  frequencies  (pole 
magnitudes)  as  determined  from  the  ten  spectral  estimates. 


k  *k 

?k 

*fk 

bkl 

olpkl 

1  0.20 

0.1998 

0.0012 

0.9944 

0.0062 

2  0.215 

0.2159 

0.0011 

0.9974 

0.0080 

Table  8.1: 

Statistics  of  SVD 

ARMA  (4,4) 

estimates. 

To  demonstrate  the  worth  of  singular  values  in  model  order  determination 
when  using  the  SVD  approach,  the  fifteen  singular  values  which  characterized 
the  extended  order  autocorrelation  matrix  estimate  Re  for  one  of  the  ten 
observation  sets  are  now  given 

=  18.3  ,  =  18.2  ,  ~~  5.30  ,  “  4.69 

*  0.85  ,  =  0.78  ,  «  .  .  ,  Oj j  s  0.21 

It  is  apparent  that  the  first  four  singular  values  are  dominant  (i.e.,  (4)  « 

0.995)  thereby  indicating  that  the  effective  rank  of  Re  is  four.  Thus,  the 
correct  selection  of  ARMA  order  p  =  q  =  4  is  made  upon  examination  of  the 
singular  values  behavior. 

Alternative  Method:  In  Section  III,  an  alternate  method  for  detecting  and 
estimating  the  frequencies  of  sinusoids  in  white  noise  was  proposed.  This 
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method  is  represented  by  the  least  squares  solution  (3.9)  to  an  overdetermined 
system  of  linear  equations.  Using  this  expression  with  a  selection  of  p  *  14, 
t  =  50,  f  =  I,  and,  unbiased  autocorrelation  lag  estimate  entriea  for  R,, 
spectral  estimates  for  each  of  the  ten  time  series  of  length  128  were 
generated.  The  results  of  these  estimates  are  depicted  in  superimposed  style 
in  Figure  8.5  in  plots  of  1/ lAj4(eJw) |2 .  Two  sharply  resolved  peaks  are 
achieved  in  each  of  the  ten  estimates.  It  is  noteworthy  that  this  procedure 
provided  good  estimates  for  a  low  order  choice.  A  paper  now  in  preparation 
will  demonstrate  the  exceptional  performance  of  this  new  procedure  for  a  more 
general  class  of  deterministic  signals  in  white  noise.  Improvement  is  thereby 
achieved  by  making  an  estimate  of  the  white  noise  variance  o^  using  expression 
(3.2)  at  n=0,  then  subtracting  this  estimate  from  the  rz(0)  term  and  then  using 
an  SVD.  Initial  empirical  evidence  suggests  that  this  new  approach  provides 
significantly  better  performance  than  the  Pisarenko  method  [55]  and  its 
varients,  and,  the  Kumaresan- Tufts  approach  [42], [61]. 

Comparison  with  the  Kumaresan-Tuf ts  Method 

We  shall  now  consider  a  time  series  of  form  (8.3)  in  which  the  relevant 
parameters  are  given  by 

fi  =  0.2,  f2  -  0.21,  «w2  =  1.778 

This  particular  parameter  choice  provides  a  more  challenging  test  of  resolution 
capability  in  that  the  frequency  spacing  f2-f2  =  0.01  is  smaller  and  the  SNR  of 
-5dB  is  lower  than  that  of  time  series  (8.3).  Again  ten  statistically  sample 
runs  each  of  length  128  were  used  for  testing  four  AR  type  models.  In  the 
first  AR  model,  expression  (7.2)  with  choices  of  qe  =  -l,  pe  *  35,  t  -  90 
(giving  90  YW  equation  approximations)  were  made.  Unbiased  autocorrelation 
estimates  were  then  used  to  form  the  90  x  36  matrix  estimate  Re.  Finally,  the 
optimum  autoregressive  parameter  estimates  were  generated  upon  using  expression 
(7.15).  The  resultant  ten  AR(35)  spectral  estimates  are  shown  in  superimposed 
plots  in  Fig.  8.6a  where  resolution  was  achieved  in  each  of  the  ten  rnns. 
Next,  the  extended  autocorrelation  matrix  model  (7.2)  with  qe  «■  -1,  pe  «  96,  t 
=  96,  and  unbiased  autocorrelation  lags  was  tested.  Expression  (7.15)  with  p  ” 
4  was  then  used  to  generate  the  ak  estimates  of  the  AR(96)  model.  A  plot  of 
the  resultant  spectra  is  shown  in  Fig.  8.6b  where  resolution  was  achieved  for 
each  of  the  ten  runs. 
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The  Kumaresan-Tuf ts  method,  which  provides  a  near  maximum-likelihood 
performance,  was  next  tested  on  these  same  ten  sample  runs  [61] .  The  resultant 
AR  type  35^h  and  96 (the  optimum  KT  order  choice)  order  spectra  are  shown 
plotted  in  Figs.  8.6c  and  8.6d,  respectively.  The  35th  order  model  was  unable 
to  resolve  the  sinusoids  in  any  of  the  ten  runs  while  the  96th  order  model 
achieved  a  resolution  in  each  case.  For  this  example,  it  is  apparent  that  the 
overextended  modeling  approach  advocated  in  this  paper  has  outperformed  the 
pseudo  maximum-likelihood  Kumaresan-Tuf ts  method.  Moreover,  the  computational 
efficiency  of  this  paper's  overextended  modeling  method  (7.15)  is  far  superior 
as  will  be  documented  in  a  forthcoming  paper. 

Adaptive  ARMA  Modeling 

As  a  final  example,  the  adaptive  ARMA  modeling  procedure  to  be  developed 
in  Section  X  was  applied  to  the  time  series  (8.3)  in  which  the  covariance  mode 
(kj  =  40,  k2  =  1)  was  selected  with  ARMA  order  p=12.  The  spectral  estimates  of 
five  independent  runs  at  data  lengths  N  =  128,  N  =  256  and  N  =  1024  are  shown 
superimposed  in  Figure  8.7.  From  these  plots,  it  is  apparent  that  the  twelfth 
order  ARMA  model  detects  the  presence  of  spectral  energy  in  the  neighborhood  of 
f  =  0.2  at  data  length  N  =  128,  but,  the  resolution  of  two  spectral  peaks  is 
somewhat  unsatisfactory.  As  the  ARMA  model  adapts  to  the  data,  however,  two 
well  defined  spectral  peaks  appear  at  N  2  256.  The  model  has  therefore  adapted 
to  the  signal  using  less  than  256  time  series  observations. 

To  illustrate  the  performance  of  this  adaptive  ARMA  approach  relative  to 
popularly  used  methods,  the  classical  adaptive  AR  covariance  method  to  be 
developed  in  section  IX  was  next  used  on  the  same  set  of  time  series 
observations.  The  five  spectral  plots  which  arose  for  an  AR(22)  model  are 
shown  superimposed  in  Figure  8.8  at  N  2  128,  N  =  256,  and,  N  2  1024.  Clearly, 
the  higher  order  covariance  AR  model  was  unable  to  satisfactorily  resolve  the 
two  sinusoids  even  at  data  length  N21024.  Thus,  the  lower  order  ARMA  (12,12) 
covariance  adaptive  model  significantly  outperformed  the  higher  order  AR(22) 
covariance  adaptive  model.  This  is  indeed  noteworthy  when  it  is  realized  that 
some  of  the  more  widely  used  adaptive  filters  utilize  the  AR  covariance  model. 
This  includes  the  fast  LMS  algorithm  of  Morf  [25]  ,[46]  ,[47]  and  the 
approximating  gradient  approach  of  f idrow  [65] . 
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Fig.  8.2  Moving  average  (MA)  spectral 
estimates  using  the  FFT 
implementation  of  the  periodogram 
with  128  time  series  observations 
(a)  no  zero  padding,  (b)  128  zero 
padding,  (c)  384  zero  padding. 
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.  8.3  Autoregressive  (AR)  spectral  estimates 
from  128  time  series  observations 
(a)  p  =  20  Burg  estimate,  (b)  p  =  24 
Burg  estimate  (c)  This  paper’s  method 
(6.6)  with  q  =  0,  p  =  20,  t  =  56 
estimate  using  expression  (6.6). 
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IX.  AS  Modeling:  Adaptive  Implementation 

In  section  V,  a  general  procedure  for  effecting  an  AS  model  which 
represents  a  set  of  given  time  series  observations  4 

x(l).  x(2) . x(N)  (9.1) 

was  presented.  It  was  there  shown  that  the  required  modeling  entailed  using 
these  time  series  observations  to  generate  estimates  for  the  entries  of  the 
(p+l)x(p+l)  AS  autocorrelation  matrix  as  specified  by 

R<  i*  j )  =  rx(i”j>  1  i  »,j  i  P+1  (9.2) 

From  a  performance  viewpoint,  the  unbiased  autocorrelation  estimate  (S.4)  was 
suggested  as  a  logical  choice  for  estimating  these  entries.  In  any  case, 
once  estimates  for  the  R(i,j)  elements  have  been  made,  the  required  AR(p) 
model  parameters  are  obtained  by  solving  the  linear  system  of  equations 

A 

R  a  =  lb0|2  ej  (9.3) 

where  it  will  be  recalled  that  the  paramettr  b0  is  chosen  so  that  the  first 
component  of  a  is  one. 

In  applications  requiring  a  continuous  updating  of  the  AR  model 
parameters  as  new  time  series  observations  become  available  (i.e.,  x(N+l), 
x(N+2),  ...),  however,  the  standard  unbiased  estimator  approach  poses  a 

serious  computational  burden.  To  overcome  this  difficulty,  it  behooves  us  to 
seek  alternate  autocorrelation  estimators  which  are  more  amenably  to  an 
adaptive  solution.  With  this  objective  in  mind,  we  shall  now  consider  the 
adaptive  class  of  autocorrelation  estimators  as  defined  by 

N+k2-l 

R( i»  j )  =  1  \  x( k+l-i)x(k+l-j )  l<iiP+l  (9.4) 

N+k2-ki  k=kj  l<j<P+l 

in  which  the  convention  of  setting  to  zero  any  summand  terms  x(n)  whose 
argument  lies  outside  the  observation  set  l£n£N  has  again  been  adopted. 

Although  this  expression  might  initially  appear  to  be  unduly  contrived,  it 
does  provide  us  with  an  autocorrelation  estimate  of  rx(i-j)  as  called  out  for 

in  expression  (9.2).  More  importantly,  however,  this  estimator  will  be 

shortly  shown  to  have  a  most  convenient  matrix  product  representation. 

The  integer  constants  kj  and  k2  which  characterize  the  autocorrelation 
lag  estimator  rule  (9.4)  are  to  be  selected  so  that  the  number  of  lag 

products  there  used  (i.e.,  N+k2~kj)  at  least  equals  p+1 .  This  requirement 


will  generally  ensure  the  invertibility  of  the  autocorrelation  matrix 

A 

estimate  R  associated  with  the  estimatea  (9.4).  In  most  cases  of  interest, 
these  constants  are  further  confined  to  the  range  ljki .k2<p+l  although  other 
choices  are  permissable.  It  then  follows  that  each  member  of  the  adaptive 
autocorrelation  estimator  class  will  be  identified  by  a  specific  two-tuple 
(kj,k2)1.  Moreover,  each  such  estimator  will  provide  a  generally  different 
set  of  autocorrelation  estimates  from  the  given  set  of  time  series 
observations  (9.1). 

Members  of  the  adaptive  class  of  autocorrelation  estimators  have  a 
particularly  convenient  algebraic  representation  which  we  shall  employ  when 
effecting  the  promised  adaptive  implementation.  Specifically,  the 
(p+l)x(p+l)  autocorrelation  matrix  estimate  that  arises  upon  using  the 
estimates  (9.4)  as  entries  can  be  always  expressed  in  the  following  data 
matrix  product  format 


R  “  - - — 

N+k2-kl 


(9.5) 


in  which  %  is  the  (N+k2-ki)x(p+l)  data  matrix  whose  individual  elements  are 
specified  by 

%(i.j)  =  x(ki+i-j)  lli^N+k2-ki  (9.6) 

l<j£p+l 

We  have  here  appended  the  subscript  N  to  the  data  matrix  so  as  to  explicitly 
recognize  its  dependency  on  the  data  length.  The  incorporation  of  this 
subscript  will  be  also  useful  when  obtaining  the  promised  adaptive 
implementation.  A  straightforward  analysis  will  demonstrate  the  equivalency 
of  expressions  (9.4)  and  (9.5).  The  data  matrix  is  seen  to  have  elements 
whose  entries  are  the  given  time  series  observations  (9.1)  as  well  as  zeroes 
which  appear  whenever  the  time  index  argument  (kj+i-j)  falls  outside  the 
observation  set  l£n£N. 

It  is  possible  to  provide  a  revealing  visual  interpretation  to  the 
concept  of  data  windowing  for  this  class  of  estimators.  In  particular,  let 
us  consider  the  following  (N+p)x(p+l)  kernel  Toeplitz  type  matrix  which 


lAs  we  will  shortly  see,  the  four  most  widely  used  members  of  this  class  are 
the  covariance  method  (kj=p+l,  k2=l).  the  autocorrelation  method  (kj=l, 
k2=p+l),  the  prewindow  method  (k2Bl,  k2=l) ,  and,  the  postwindow  method 
(kj=p+l  ,k2*>p+l) . 
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contains,  as  submatrices,  all  of  the  data  matrices  associated  with  the 
adaptive  class  of  autocorrelation  estimators. 


_  *1  =  1 

Prewindowing 
_  *1  =  P+1 

-  *2  =  1 

Postwindowing 
-  k2  =  p+1 


(9.7) 


Upon  examination  of  the  data  matrix  definition  (9.6),  it  is  apparent  that 
may  be  identified  with  that  submatrix  of  jq  composed  of  its  k^®*  through 
(N+k2-l)st  rows,  inclusively.  Thus,  corresponding  to  each  adaptive 
autocorrelation  estimator  (i.e.,  pair  (k^,k2)),  we  may  obtain  the  associated 
data  matrix  using  this  row  identification  scheme. 

The  zeroes  which  appear  in  the  upper  right  corner  of  the  kernel  matrix 
.jj  are  there  due  to  the  implicit  assumption  that  x(n)  =  0  for  -p+l£n£0.  This 
rather  unrealistic  assumption  concerning  an  unobserved  segment  of  the  time 
series  is  coaunonly  referred  to  as  a  prewindowing  of  the  data.  It  is  seen 
that  a  degree  of  prewindowing  is  incorporated  whenever  the  constant  kj  is 
selected  such  that  l<k^<p.  Normally,  such  choices  are  to  be  avoided  since 
they  will  generally  lead  to  relatively  poor  AR  modeling  due  to  the 
unrealistic  prewindow  assumption  thereby  being  made  on  the  time  series.  As 
kj  ranges  over  the  integers  1  to  p+1,  the  degree  of  prewindowing  incorporated 
varies  from  full  at  kj=l  to  none  at  kj=p+l .  This  prewindowing  behavior  is 
conveniently  depicted  in  expression  (9.7) . 

In  a  similar  fashion,  the  zeroes  which  appear  in  the  lower  left  corner 
of  matrix  7^  are  there  due  to  the  implicit  postwindow  assumption  that  x(n)  *  0 
for  N+l<_n£N+p.  This  equally  unrealistic  assumption  concerning  an  unobserved 
segment  of  the  time  series  is  to  be  generally  avoided.  A  degree  of 
postwindowing  is  incorporated  whenever  the  index  k2  is  chosen  to  lie  in  the 
range  2 Ik? < p+1 .  The  smallest  value  of  k2  for  which  the  postwindow  assumption 


is  avoided  is  seen  to  be  ^2=1 •  Thus,  as  the  index  k2  ranges  fro*  1  to  p+1 , 
the  degree  of  postwindowing  incorporated  varies  from  none  at  k2«l  to  full  at 
k2=p+l . 

The  four  most  widely  used  of  the  adaptive  autocorrelation  estimator 
methods  are  listed  in  Table  9.1  (e.g.,  see  refs.  [25] ,[46] ,[47] ) .  Each  of 
the  methods  there  shown  are  seen  to  entail  combinations  of  maximum  windowing 
and  no  windowing.  In  the  covariance  method,  the  characteristic  constants  are 
chosen  to  be  k2  =  p+i  and  k2=l.  This  particular  choice  is  seen  to  provide 
the  largest  number  of  lag  products  in  the  autocorrelation  estimates  (9.4) 
over  which  no  data  windowing  is  involved.  As  might  be  expected,  the 
covariance  method  generally  provides  the  best  AR  modeling  and  spectral 
resolution  performance  when  compared  with  the  remaining  members  of  the 
adaptive  autocorrelation  estimator  class.  With  this  in  mind,  unless  speeial 
considerations  dictate  otherwise,  the  covariance  method  is  the  most 
preferable  choice  for  an  adaptive  implementation. 

In  the  three  remaining  methods  listed  in  Table  9.1,  it  is  seen  that  a 
maximum  amount  of  prewindowing,  postwindowing,  or,  both  are  being  employed. 
It  is  then  not  surprising  that  each  of  these  methods  will  generally  provide 
relatively  poor  modeling  performance.  This  will  be  particularly  true  for 
data  lengths  N  which  are  not  significantly  larger  than  the  AR  order  parameter 
p.  As  the  data  length  N  increases  so  that  N>>p,  however,  each  of  the  four 
methods  will  provide  comparable  modeling  performance.  This  is  due  to  the 
fact  that  the  windowed  portions  of  the  data  matrix  will  play  a 

A 

proportionately  smaller  role  in  the  estimate  R  as  N  increases.  An 
appreciation  for  this  behavior  is  readily  obtained  upon  examination  of  the 
kernel  matrix  (9.7). 

As  suggested  earlier,  the  primary  reason  for  preferring  the  adaptive 
autocorrelation  estimator  (9.4)  over  the  standard  unbiased  estimator  (5.4)  is 
that  the  former  may  be  used  to  effect  a  computationally  efficient  adaptive  AR 
modeling  method.  To  gain  an  insight  as  to  why  this  is  so,  let  us  first 
substitute  the  autocorrelation  matrix  estimate  (9.5)  into  the  fundamental  AR 
modeling  expression  (9.3).  The  required  parameters  of  the  AR(p)  model  are 
then  found  by  solving  the  resultant  system  of  normal  equations 

Xn‘%^  =  (N+kz-k^lbol2  e.!  (9.8) 

in  which  the  normalizing  parameter  b0  is  to  be  selected  so  that  the  first 
component  of  a^  is  one.  The  data  matrix  product  %  %  in  this  expression  is 


METHOD 

CONSTANT 

*1 

CONSTANT 

k2 

STATISTICAL 
PROPERTIES 
OF  R 

1.  Covariance 

(No  windowing) 

P+1 

1 

(i)  unbiased 
(ii)  consistent 

2.  Full  Prewindowing 
No  Postwindowing 

1 

1 

(i)  biased 
(ii)  consistent 

3.  Full  Postwindowing 
No  Prewindowing 

P+1 

P+1 

(i)  biased 
(ii)  consistent 

4.  Autocorrelation 
(Full  pre  and 
postwindowing) 

1 

P+1 

(i)  biased 
(ii)  consistent 
(iii)  Toeplitz 

Table  9.1  Adaptive  AR  Autocorrelation  Estimation  Methods 


seen  to  completely  characterize  the  desired  autoregressive  parameter  vector 
a^  associated  with  the  N  time  series  observations  (9.1). 

As  the  time  index  N  is  incremented  by  one  (i.e.,  the  (N+l)st  time  series 
observation  x(N+l)  becomes  available),  it  is  seen  that  a  new  system  of  normal 
equations  of  form  (9.8)  will  arise  in  which  the  index  N  is  replaced  by  N+l . 
The  resultant  data  matrix  product  Xjij+i*Xn+1  which  characterizes  this  new 
system  of  equations  will  in  turn  give  rise  to  the  updated  autoregressive 
parameter  vector  aj^+j .  We  can  continue  this  systematic  procedure  to  generate 
the  updated  autoregressive  parameter  vectors  aty+2 »  1n+3 »  etc.  as  the  new  time 
series  observations  x(N+2),  x(N+3),  etc.  become  available.  The  ability  to 
evolve  an  adaptive  solution  procedure  when  using  this  approach  will  be  then 
dependent  on  our  obtaining  an  effective  method  for  updating  the  data  matrix 
products  as  N  evolves. 

Adaptive  Algorithm:  k2  =  1 

The  adaptive  expression  relating  the  successive  data  matrix  products 
will  be  considerably  eased  if  the  constant  k2  is  selected  so  as  to  provide 
either  no  or  full  postwindowing.  To  illustrate  this  point  let  us  first 


72 


examine  the  case  of  nonnostwindowina  for  which  k2=l  while  allowing  kj  to  take 
on  anjr  value  in  [l,p+l],  Pros  examination  of  the  defining  expression  (9.6), 
it  is  seen  that  the  data  matrix  Xn+j.  ®ay  be  obtained  by  appending  a  row 
vector  to  the  bottom  of  data  matrix  X^.  This  results  in  the  following 
recursion  on  the  data  matrix  products 

%+l*%+l  =  ^  +  »+ll*i+i  N  1  P+1  <9«9> 

in  which  x^+1  is  the  above  mentioned  appended  lx(p+l)  row  vector 

S^+1  =  [x(N+l) ,  x(N).  ....  x(N+l-p)l  (9.10) 

It  is  important  to  note  that  this  data  matrix  product  recursive  expression 
commences  at  N=p+1  which  corresponds  to  the  first  time  index  at  which  X^X^ 
has  its  full  form.  Thus,  the  matrix  X^j+^Xp+i  serves  the  role  of  initializing 
the  above  recursive  relationship.  The  elements  of  this  initializing  matrix 
are  obtained  from  expression  (9.4)  upon  setting  k2=l  and  N-p+1,  that  is 

p+1 

*p+l  Xp+i(i.j)  -  J  x(k+l-i)  x(k+l-j)  1  i  i  1  P+1  (9.11) 

k=kj  1  i  j  I  P+1 

It  is  interesting  to  note  that  although  each  member  of  the  nonpostwindow 
class  (as  identified  by  k2=»l  and  kie[l,p+l])  will  be  governed  by  the  same 
recursion  (9.9),  they  will  each  give  rise  to  a  generally  different  set  of 
autocorrelation  estimates.  This  is  due  to  the  fact  that  the  initializing 
matrix  (9.11)  will  be  generally  different  for  various  choices  of  k^. 

From  recursive  expression  (9.9),  it  is  seen  that  successive  data  matrix 
products  differ  by  the  rank  one  matrix  x^+^ .  This  simple 

interrelationship  will  in  turn  enable  us  to  obtain  a  recursive  expression  for 
the  data  matrix  product  inverses  [X£  X^]~l.  We  are  interested  in  these 
inverses  since  they  will  be  ultimately  used  when  solving  expression  (9.8)  for 
the  AR  model  parameters.  This  required  matrix  inverse  recursion  will  make 
use  of  expression  (9.9)  and  the  following  well  known  lemma 


Lemma  9.1:  Let  A  and  A+  u*  y  each  be  nonsingular  sxs  matrices  where  u 
and  x.  are  lxs  vectors,  then 


[A  +  a*  y] 


A-1  _  [A~1u*] [vA-1] 

( 1  +  y  A1  u* ) 


(9.12) 
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Upon  setting  A  =  and  u  =  v  *  2n+i  in  this  lemma,  the  required  recursive 

matrix  inverse  expression  is  found  to  be 

e 

*  e  ,  VN+1  yN+1 

IXN+1  ^J+lJ  1  =  I*N  %1_1  “  _ Z _ ~  _  for  N  l  p+kx 

1  +  Z*+l  *N+1 

where  (9.13) 

yN+l  =  iN+ll^XN]-1 

In  using  this  matrix  inverse  recursion,  it  is  important  to  note  that  it  is 
only  applicable  for  time  indices  N  2  P+kj.  This  is  a  direct  consequence  of 
the  fact  that  the  data  matrix  products  Xj$  %  are  singular  for  all  time 

indices  N  <  p+k^  in  the  nonpostwindowing  case  k2=l .  To  use  this  recursive 

approach,  it  is  therefore  necessary  to  first  compute  the  initializing  matrix 
inverse  [XjJ  Xfl]-*  for  N  =  p+kj  using  a  standard  matrix  inversion  routine  such 
as  Gaussian  elimination.  Subsequent  matrix  inverses  for  N  >  p+kj  may  be  then 
efficiently  obtained  upon  using  recursion  formula  (9.13). 

To  complete  the  adaptive  AR  modeling  procedure,  we  next  incorporate  the 
data  matrix  product  inverse  routine  (9.13)  into  the  AR  modeling  equations 
(9.8) .  A  little  thought  will  convince  oneself  that  the  simple  three  step 

procedure  outlined  in  Table  9.2  will  provide  the  required  adaptive 
autoregressive  parameter  vector  procedure.  The  second  step  is  seen  to  yield 
the  unnormalized  solution  to  expression  (9.8)  with  N  replaced  by  N+l  while 
the  third  step  ensures  that  the  first  component  of  a^  is  one.  In  terms  of 
computational  complexity,  an  examination  of  equation  (9.13)  indicates  that 

2(p+l)  operations  will  be  required  for  updating  [X^X^]  .  The  resultant 

autoregressive  parameter  vector  solution  as  represented  by  steps  2  and  3  of 
Table  9.2  will  require  an  additional  (p+1)  operation*.  Thus,  the 

computational  complexity  of  the  nonpostwindowing  adaptive  algorithm  (i.e., 
k2=l)  is  then  o(p2) .  This  algorithmic  approach  is  applicable  for  any 

selection  of  the  constant  k*  with  the  most  likely  choices  being  from  the 

range  [l,p+l] .  The  most  useful  implementation  of  this  adaptive  algorithm 
corresponds  to  the  selection  kj^p+l.  In  this  case,  the  covariance  method  as 
specified  by  k^=p+l,  k2=l  is  obtained.  As  pointed  out  previously,  this 

choice  normally  provides  the  best  adaptive  AR  modeling  performance  behavior. 
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Step  0: 

Input  Data:  x(N+l),  [IflXfjl  1 

Step  1: 

Compute  [X*+j  XN+ll-1  nsing  recursion  (5.14) 

Step  2: 

Let  c  =  1  Si 

Step  3: 

aj^+2  =  c(l)-X  c.  where  c(l)  is  the  first  component  of  c,. 

Table  9.2:  Adaptive  AR  Modeling  Algorithm-Covariance  Methods 
(N0=2p+1)  and  Prewindow  (N0=p+1 )  methods. 


Adaptive  Algorithm  k2  =  p+1 

Using  similar  reasoning,  it  is  also  possible  to  evolve  an  efficient 
adaptive  algorithm  for  the  full  prewindowing  case  k2  «  p+1.  In  this 

situation,  it  is  readily  found  that  the  data  matrix  products  are  recursively 
related  according  to 

Xft+llff+l  =  XfJ%  +  Dn+i  N2P+1  (9.14) 


in  which  is  a  (p+l)x(p+l)  Toeplitx  conjugate  symmetric  matrix  with 

elements 

f  I(N+1)  x(N+l+i-j)  i  i  j 


°N(i.j) 


x (N+l )  x(N+l+i-j) 


j  l  i 


(9.15) 


Due  to  the  Toeplitz  conjugate  symmetric  property  of  the  perturbation  matrix 
Dfl,  it  will  be  possible  to  evolve  an  efficient  adaptive  method  for  inverting 
the  data  matrix  products  The  computational  complexity  of  this  matrix 


inversion  routine  will  be  also  o(p^) .  The  details  of  this  routine  are  rather 
involved  and  will  be  therefore  not  given  here  due  to  space  limitations. 
Since  the  covariance  method  is  the  most  preferable  choice  for  the  adaptive 
class  of  autocorrelation  estimators,  however,  this  omission  is  not  serious  in 
any  case.  It  is  to  be  noted  that  efficient  adaptive  lattice  structured 
algorithms  may  also  be  employed  for  updating  the  AR  parameters  [46], [47]. 

Forward-Backward  Approach 

In  some  applications,  it  is  possible  to  achieve  a  degree  of  iaiprovement 
in  the  AR  spectral  models  by  using  the  concept  of  data  time  reversal. 
Namely,  it  makes  use  of  the  observation  that  if  {x(n)J  represents  a 
wide-sense  stationary  process,  then  its  time  transposed  conjugated  image  as 
specified  by 

y(n)  =  x(s-n)  (9.16) 

will  also  be  wide-sense  stationary  for  any  choice  of  the  shift  variable  a. 
Moreover,  the  autocorrelation  sequence  of  this  time  transposed  conjugated 
image  is  readily  found  to  be  identical  to  that  of  the  original  time  series, 
that  is 

ry(n)  ■  rx(n)  (9.17) 

It  is  now  possible  to  use  this  time  transpose  property  to  effect  a  new 
autocorrelation  estimation  scheme.  In  particular,  upon  selecting  s-N-1,  the 
original  observation  set  (9.1)  is  seen  to  give  rise  to  the  following  set  of 
time  transposed  conjugated  elements 

y(n)  =  X(N+l-n)  1  i  n  £  N  (9.18) 

If  these  time  reversed  observations  are  incorporated  into  expression  (9.4), 
it  will  be  generally  found  that  a  new  set  of  autocorrelation  estimates  will 
result.  In  particular,  the  overall  backward  autocorrelation  matrix  estimate 
will  take  the  form. 

R  =  1  YftlN  (9.19) 

N+kj-ki 


in  which  the  elements  of  the  (N+k2-ki)x(p+l)  matrix  Y{j  are  given  by 

IN(i,j)  =  x(N+l-kj-i+j)  1  i  i  i  N+k2-kj  (9.20) 

1  i  j  i  P+1 
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Although  the  forward  and  backward  autocorrelation  matrix  eatimatea  (9.S) 
and  (9.19)  will  be  generally  different  (except  for  the  autocorrelation  choice 
kj  «  1,  kj  =  p+1) ,  they  are  each  aeeking  to  estimate  the  same  underlying 
autocorrelation  matrix  R.  It  then  follows  the  so-called  forward-backward 
estimate  as  specified  by 

R  =  - - -  (9.21) 

2(N+k2-kx) 


will  provide  an  additional  improvement  in  autocorrelation  estimation 
fidelity.  This  is  due  to  the  fact  that  each  of  the  entities  X^X^  and 
will  contain  lag  products  not  found  in  the  other. 

The  additional  autocorrelation  estimation  fidelity  achieved  in  using 
this  time  transposition  approach  typically  results  in  a  marginal  improvement 
in  spectral  estimation  performance.  Fortunately,  this  improvement  is  not 
accrued  at  the  cost  of  an  excessive  increase  in  computational  complexity. 
This  is  due  to  the  fact  that  the  matrices  X^  and  Yfc  which  form  R  are  Toeplitz 
type.  It  is  therefore  possible  to  devise  efficient  algorithms  that  will 
solve  the  system  of  equations 

[XflXN  +  Y^YnIun  =  a  ei  (9.22) 

in  which  the  computational  complexity  is  o(p2)  . 

AR  Model  Order  Determination 

One  of  the  principal  considerations  in  obtaining  AR  models  from  raw  time 
series  observations  is  that  of  model  order  selection.  It  has  been  observed 
that  when  p  is  selected  too  low,  there  will  be  generally  too  few  model  poles 
to  adequately  represent  the  underlying  spectrum.  On  the  other  hand,  too  high 
of  a  choice  for  p  will  typically  result  in  spurious  effects  (e.g.,  false 
peaks)  in  the  spectral  estimate.  With  these  thoughts  in  mind,  investigators 
have  proposed  various  order  selection  procedures.  Three  of  the  more  widely 
used  techniques  are  Akiake's  final  prediction  error  method  as  well  as  his 
information  criterion  [1],[2],[4],  and,  Parzen's  'criterion  autoregressive 
transfer'  function  [54] .  Although  these  procedures  typically  work  well,  they 
can  yield  unsatisfactory  performance  in  some  cases  (e.g.,  see  ref.  [34]  and 
[62]).  The  user  is  therefore  cautioned  to  use  discretion  in  applying  the 
above  and  other  model  order  determination  procedures.  The  method  to  be 


ultimately  used  should  be  determined  through  empirical  experimentation  based 
on  time  series  related  to  the  specific  application  under  consideration. 

It  is  possible  to  apply  a  conceptionally  straightforward  procedure  for 
model  order  selection  which  does  not  possess  many  of  the  drawbacks  alluded  to 
above.  It  is  based  on  the  observation  that  in  the  case  where  perfect  AR 
autocorrelation  lag  values  are  given,  the  (p+l)x(p+l)  autocorrelation  matrix 
R  with  elements 

R(i,j)  =  rx(i-j)  1  <  i,j  i  p+1  (9.23) 

will  have  rank  p+1  so  long  as  p  is  less  than  or  equal  to  the  order  of  the 
underlying  AR  process  (hereafter  taken  to  be  pj)  .  For  all  values  of  p 
greater  than  pj,  however,  the  rank  of  the  (p+l)x(p+l)  autocorrelation  matrix 
will  be  P}.  Thus,  to  determine  the  proper  rank  selection  in  the  idealistic 
case  of  perfect  autocorrelation  lag  information,  we  simply  increase  the 
parameter  p  until  the  rank  of  R  is  less  than  full  (i.e.,  less  than  p+1). 
This  will  occur  at  p  =  pj+1,  thereby  giving  us  the  appropriate  order 
selection.  It  should  be  noted  that  when  the  autocorrelation  lags  being  used 
don't  correspond  to  an  AR  process,  then  the  matrix  R  will  be  generally  of 
full  rank  for  all  p  ^  1. 

In  the  more  realistic  case  in  which  raw  time  series  are  used  to  form  the 

A 

autocorrelation  matrix  estimate  R,  the  rank  of  this  matrix  will  be  typically 
full  for  all  values  of  p.  This  will  be  true  even  when  the  time  series  is  an 
AR  process.  This  seeming  contradiction  arises  due  to  statistical  errors 
inherent  in  any  autocorrelation  lag  estimation  procedure  that  might  be  used 

A 

in  forming  R.  Nonetheless,  even  though  R  will  have  full  rank,  it  will  be 
generally  found  that  when  p  >  pj,  this  matrix  will  have  (p-pj)  of  its 
eigenvalues  'close*  to  zero.  Thus,  an  order  selection  procedure  which  has 
provided  satisfactory  performance  is  one  entailing  examination  of  the 

A 

eigenvalue  behaviour  of  the  autocorrelation  matrix  estimate  R  as  a  function 
of  p.  The  appropriate  order  choice  will  be  that  value  of  p,  denoted  by  p^, 

A 

for  which  R  has  (p-pj)  of  its  eigenvalues  sufficiently  close  to  zero  for  all 
p  >  P}.  A  particularly  attractive  method  (i.e.,  the  SVD  method)  for 

implementing  this  procedure  was  given  in  Section  X. 
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X.  ARMA  Modeling:  Adaptive  Implementation 

When  an  adaptive  implementation  of  the  ARMA  node ling  Methods  as 
described  in  Sections  VI  and  VII  is  desired,  it  will  be  necessary  to 
incorporate  autocorrelation  lag  estimate  procedures  which  are  compatible  with 
an  adaptive  implementation  (the  unbiased  estimator  is  not  compatible).  In 
particular,  we  shall  now  examine  a  class  of  estimators  which  provides  an 
adaptive  mechanism  for  estimating  the  elements  of  the  autocorrelation  matrix 

Rl  as  required  in  expression  (6.6) .  This  class  of  adaptive  estimators  will 
be  governed  by  the  relationship 

N— q— 2+k2 

Rl(i.j)  =  1  5  x(  k+l-i) x  (k+q+2- j )  l£i£t  (10.1) 

N+k2-kl-q-l  k=kj 

It  is  apparent  that  this  expression  provides  an  estimate  for  the  lag  element 
rx(q+l+i-j)  which  is  the  (i.j)**1  element  of  the  autocorrelation  matrix  Rj  as 
defined  in  equation  (6.2).  The  fixed  constants  kj  and  k2  which  characterize 
this  estimator  are  normally  selected  so  that  the  number  of  lag  products  there 
used  (i.e.,  N+k2-kj-q-i)  equals  or  exceeds  p+1 .  This  choice  will  generally 

A  A 

ensure  the  ir "tvtibility  of  matrix  R*WRj  and  thereby  a  unique  solution  for 
the  autoregressive  parameter  when  using  expression  (6.6).  For  reasons  which 
will  be  shortly  made  apparent,  these  constants  are  usually  further 
constrained  to  satisfy  1  £  k-^  ^  t  and  1  £  k2  £  p+1  although  other  choices  are 
possible. 

Each  autocorrelation  estimator  in  the  adaptive  class  (10.1)  will  be 
identified  by  a  particular  choice  of  the  two-tuple  (k2,k2) .  Moreover,  each 
estimator  in  this  class  will  provide  a  generally  different  set  of 
autocorrelation  lag  estimates  from  the  set  of  time  series  observations  x(n) 
for  1  £  n  £  N.  Clearly,  our  ultimate  desire  is  to  select  that  estimator 
which  generally  provides  the  best  ARMA  modeling.  The  covariance  estimator  as 
identified  by  kj=t  and  k2=l  furnishes  an  obvious  choice.  Before  treating 
specific  estimators,  however,  let  us  first  examine  the  general  adaptive 
estimator  (10.1). 

The  primary  reason  as  to  why  the  adaptive  estimator  (10.1)  lends  itself 
to  an  adaptive  implementation  is  due  to  the  algebraic  structure  implicitly 
contained  within  its  definition.  Namely,  the  autocorrelation  matrix  estimate 
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as  formed  from  the  entries  (10.1)  may  be  always  representable  in  the 
convenient  matrix  product  format 


R1 


(10.2) 


in  which  the  (N+k2-ki-q~l)x(p+l)  data  matrix  Xfj  has  its  elements  specified  by 
%(i,j)  =  x(k1+q+l+i-j)  1  i  i  i  N+k2-k1-q-l  (10.3) 

1  <  j  i  P+1 

while  the  (N+k2~ki-q-l)xt  data  matrix  Y^  has  elements 

YN(i,j)  =  x(k2  +  i-j)  1  i  i  i  N+k2-k1-q-l  (10.4) 

1  1  j  i  t 

A  simple  matrix  manipulation  will  prove  the  equivalencies  of  expressions 
(10.1)  and  (10.2).  We  again  adopt  the  convention  of  setting  to  zero  any 
element  entries  of  Xfl  or  Yjq  for  which  x(n)  lies  outside  the  observation 
interval  1  <  n  £  N,  and,  we  also  attach  the  subscript  N  to  these  data 

matrices  so  as  to  explicitly  recognize  there  dependency  on  data  length. 

As  in  the  AR  modeling  case,  the  parameters  kj  and  k2  that  identify  the 
autocorrelation  estimator  (10.1)  can  give  rise  to  data  windowing.  To  see  why 
this  is  so,  let  us  consider  two  kernel  Toeplitz  type  matrices  which  contain, 
as  submatrices,  all  of  the  data  matrices  associated  with  the  adaptive  class 
of  ARMA  autocorrelation  estimators.  These  kernel  matrices  are  specified  by 


microcopy  resolution  test  chart 
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Upon  examination  of  expressions  (10.3)  and  (10.4).  it  is  readily  established 
that  the  data  matrix  (or  Yjj)  may  be  identified  with  that  snbmatrix  of  the 
kernel  matrix  ^  jj  (or  Vn>  composed  of  its  through  (N-q-2+k2),t  rows 

inclusively.  Thus,  corresponding  to  each  adaptive  autocorrelation  estimator 
(i.e.,  choice  of  pair  (k^,k2)).  there  will  be  an  associated  pair  of  data 
matrices  obtained  by  using  this  row  identification  scheme. 

The  zeroes  which  appear  in  the  upper  right  corner  of  kernel  matrix  ^ 
are  there  due  to  the  implicit  prewindow  assumption  that  x(n)  “  0  for  2-t£n£0. 
This  unrealistic  restriction  on  an  unobserved  segment  of  the  time  series  is 
to  be  normally  avoided.  It  is  to  be  noted  from  the  representation  for  Y^ 
that  a  selection  of  k-| > t  will  avoid  any  data  prewindowing.  On  the  other 
hand,  a  degree  of  prewindowing  is  incorporated  whenever  kj  is  such  that  1 
<kjJ_t-l .  Thus,  as  kj  ranges  over  the  integers  1  to  t,  the  amount  of 
prewindowing  varies  from  full  at  ki=l  to  none  at  k^  =  t. 

In  a  similar  fashion,  the  zeroes  which  arise  in  the  lower  left  corner  of 
kernel  matrix  are  there  due  to  the  implicit  postwindow  assumption  that 
x(n)  -  0  for  N+l£n$N+p.  This  contrived  assumption  on  an  unobserved  segment 
of  the  time  series  is  also  to  be  avoided.  Upon  examination  of  the  kernel 
matrix  N,  it  is  apparent  that  postwindowing  may  be  avoided  by  selecting  k2  £ 
1.  It  is  also  clear  that  the  degree  of  postwindowing  varies  from  none  at 
k2=l  to  full  at  k2=p+l. 

The  four  most  appealing  choices  for  adaptive  estimators  are  identified 
in  Table  10.1  in  which  it  is  noted  that  each  involves  combinations  of  maximum 
windowing  and  no  windowing.  The  covariance  method  entails  that  particular 
combination  of  no  prewindowing  (i.e.,  k*  =  t)  and  no  postwindowing  (i.e., 
k2=l).  This  method  is  seen  to  provide  the  laraest  number  of  lag  products 
(i.e.,  N-q-t)  in  estimator  (10.1)  for  which  no  data  windowing  it  involved. 
As  might  be  expected,  the  covariance  method  typically  provides  the  best 
modeling  performance  from  the  ARMA  adaptive  class  of  autocorrelation 
estimators. 

The  three  other  methods  listed  in  Table  10.1  are  seen  to  employ  either 
full  prewindowing,  full  postwindowing,  or,  both.  It  is  clear  that  the 
modeling  performance  capabilities  of  each  of  these  three  methods  will  tend  to 
be  relatively  poor  when  the  data  length  N  is  only  marginally  larger  than  the 
ARMA  order  parameter  p  or  the  parameter  t.  On  the  other  hand,  for  the  ease 
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in  which  N  it  much  larger  than  either  p  or  t,  each  of  the  Methods  listed  in 
Table  10.1  will  provide  coaparable  Modeling  perforaance.  This  it  a 
consequence  of  the  fact  that  the  windowed  portions  of  the  data  Matrices 
and  n  play  a  proprtionately  saaller  role  in  the  estinate  of  Rj  as  N 
increases.  In  any  case,  unless  special  considerations  dictate  otherwise,  the 
covariance  Method  is  the  aost  preferable  choice  for  an  adaptive 
iapleaentation. 


METHOD  CONSTANT  CONSTANT  STATISTICAL 


kl 

k2 

PROPEJ 
OF  &N 

{TIES 

1. 

Covariance 

t 

1 

(i) 

unbiased 

(No  windowing) 

(ii) 

consistent 

2. 

Full  Prewindowing 

1 

1 

(i) 

biased 

No  Postwindowing 

(ii) 

consistent 

3. 

Full  Postwindowing 

t 

P+1 

(i) 

biased 

No  Prewindowing 

(ii) 

consistent 

4. 

Autocorrelation 

1 

P+1 

(i) 

biased 

(Full  pre  and 

(ii) 

consistent 

postwindowing) 

( iii) 

Toeplitz 

Table  10.1:  Four  ASMA  Adaptive  Autocorrelation 
Estiaator  Methods 

In  order  to  provide  the  reason  as  to  why  aenbers  of  the  adaptive  class 
of  autocorrelation  estimators  are  aaenable  to  an  adaptive  iapleaentation,  let 
na  substitute  the  matrix  product  representation  for  Rj  as  given  by  expression 
(10.2)  into  the  basic  ARMA  aodeling  equation  (6.6) .  The  resultant 
autoregressive  paraaeter  vector  is  then  obtained  by  solving  the  noraal 
equations 

*$*N*fi*N  AN  "  a  Rl  (10.6) 
where  the  weighting  astrix  V  has  been  set  equal  to  the  identity  aatrix  while 
the  uoraalizing  constant  a  is  selected  so  that  the  first  coaponent  of  is 
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one.  From  thi*  expression  it  is  apparent  that  the  data  Matrix  product 
completely  identifies  the  ARMA  model's  autoregressive  parameter 
vector.  In  order  to  compute  a^,  it  will  be  then  necessary  to  compute  the 
data  matrix  product's  inverse  at  each  value  of  N  where  the  autoregressive 
parameter  vector  is  required.  This  can  be  a  particularly  imposing 
computational  task  if  real  time  signal  processing  is  to  be 
archieved. 

Adaptive  Algorithmi  kj  ■  1 

When  the  autoregressive  parameter  vector  is  required  at  each  time  index 
N,  it  will  be  beneficial  to  effect  an  adaptive  method  fox  updating  the  data 
matrix  product  in  expression  (10.6).  This  adaptive  implementation  is  readily 
achieved  for  the  nonpostwindowing  case  k2*l  in  which  kj  may  take  on  any 
appropriate  value  (e.g.,  l^kj^t).  Namely,  upon  examination  of  expression 
(10.5)  with  k2~l,  it  is  seen  that  the  data  matrices  X^+l  *nd  YN+1  are 
obtained  by  appending  appropriate  row  vectors  to  the  bottom  of  data  matrices 
Xjj  and  Tty,  respectively.  Using  this  property,  the  following  recursion  on  the 
data  matrix  product  is  obtained 

yN+1  xN+1  =  yN*N  +  ZN  SN  N  1  4  (10.7a) 

where  xjj  and  y^  are  the  above  mentioned  lx(p+l)  and  lxt  row  vectors,  that  are 
appended  to  XN  and  Tr.  respectively.  These  vectors  are  specified  by 

*N  ■=  [*(N+1).  x(N) ,  ...,  x(N+l-p)l  (10.7b) 

yjj  *  Ix(N-q) ,  x(N-q-l),  ....  x(N+l-q-t)]  (10.7c) 

It  is  to  be  noted  that  recursive  expression  (10.7a)  holds  only  for  time 
indices  N^t  since  t  is  the  first  time  index  where  takes  on  its  full 

algebraic  form.  With  this  in  mind,  Y*X*  then  serves  the  role  of  an 
initializing  matrix  for  this  recursion.  Although  the  perturbation  matrix 
yj^N  in  this  recursion  does  not  depend  on  the  parameter  kj,  the  initialising 
matrix  Y*X*  does.  As  such,  the  sequence  of  matrix  products  as  generated  by 
expression  (10.7a)  will  be  different  for  various  choices  of  kj. 

The  full  data  matrix  product  as  required  in  expression  (10.6)  may  be 
readily  obtained  from  relationship  (10.7)  and  takes  on  the  following 
recursive  form 

xn+iyn+iyn+ixn+i  “  XNYNYNXN  +  +  +  (YnZn)  iJ^N 

N  l  t  (10.8a) 

where  is  the  lx(p+l)  vector  given  by 
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An  examination  of  this  recursive  expression  indicates  that  t(p+l)  operations 
are  required  to  compute  while  another  2(p+l)^  operations  are  expended  in 
updating  the  full  matrix  product  (10.8a).  In  arriving  at  this  computational 
requirement  measure,  it  has  been  tacitly  assumed  that  the  matrix  products 
yN*N  *nd  are  available. 

When  updates  of  the  autoregressive  parameter  vectors  are  not  required 
at  each  time  index  N,  we  could  then  use  recursions  (10.7)  and  (10.8)  to 
compute  the  data  matrix  products  *1$%  tad  in  a  computationally 

efficient  manner.  At  those  time  instants  at  which  the  evaluation  of  a^  is 
required,  we  would  then  simply  solve  the  ARMA  modeling  equations  (10.6) .  If 
standard  procedures  are  used,  this  solution  will  entail  on  the  order  of 
(p+l)3  computations. 

In  various  applications,  however,  it  may  be  neceasary  to  compute  the 
autoregressive  parameter  vector  at  each  (or  nearly  each)  value  of  time  N. 
For  such  cases,  it  would  be  much  more  advantageous  to  replace  the  recursion 
(10.8)  by  a  recursion  for  the  inverse  matrix  product  [X$TnY£In]~1 .  To  effect 
this  recursion,  we  note  from  relationship  (10.8)  that  the  matrix  products 
at  two  contiguous  time  indices  (i.e.,  N  and  N+l)  differ  by  the  sum 
of  three  rank  one  matrices.  Using  this  fact,  it  is  then  possible  to  apply 
Lemma  9.1  successively  three  times  to  effect  the  desired  matrix  product 
inverse  recursion.  The  main  steps  of  this  recursive  inversion  are  listed  in 
Table  10.2.  It  is  important  to  note  that  this  recursion  commences  at  N  ** 
q+p+kj+1  which  corresponda  to  the  first  time  instant  at  which  the  matrix 
product  is  generally  invertible.  Steps  3  through  6  provide  the 

mechanism  for  this  matrix  product  inversion  while  step  7  gives  the  required 
solution  to  the  ARMA  modeling  equations  (10.6).  In  term  of  computational 
complexity,  it  is  readily  shown  that  the  number  of  multiplication  and 
addition  operations  required  to  implement  this  algorithm  is  of  order  p(t+3p) 
for  each  data  point  update. 

The  adaptive  algorithm  described  in  Table  10.2  is  for  the  particular 
nonpostwindowing  aelection  k2“l  wherein  the  parameter  kj  will  be  typically 
selected  to  satisfy  1  1  kj  t.  As  suggested  earlier,  the  covariance  method 
identified  by  kj«t  and  k2*l  generally  provides  the  best  overall  modeling 
performance  for  the  class  of  adaptive  estimators.  Ve  may  therefore  use  the 
adaptive  ARMA  modeling  method  to  provide  an  efficient  procedure  for 
recursively  implementing  the  desirable  covariance  method.  As  a  final  note. 
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although  it  is  possible  to  effect  adaptive  implementations  for  other  choices 
ki  (i.e.,  ki#l) ,  the  resultant  algorithm  is  of  a  much  more  compler  nature. 
Since  the  covariance  method  is  almost  invariably  used,  however,  we  shall  be 
content  with  the  nonpostwindowing  algorithm. 

It  is  also  possible  to  provide  a  lattice  implementation  of  the  adaptive 
algorithm  here  developed  [19]  and  [49].  This  will  entail  restricting  t  •  p 
thereby  imparting  a  decrease  in  spectral  estimation  performance.  The 
advantage  accrued  by  using  the  lattice  implementation  is  computational  in 
nature.  In  particular,  the  number  of  operations  to  update  the  lattice 
network  is  o(p  up)  for  each  new  time  series  observation. 
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STEP  0 

The  input  to  commence  the  algorithm  at  N  *■  q+p+kj+1  it 

tnxn  and  ^xnynyn1n^-1 

STEP  1 

N  “  q+p+kj+1 

STEP  2 

Compute  using  expression  (10.7) 

STEP  3 

AN  =  YN  YN*N 

STEP  4 

fil  *  *N*  £l  “  AN*  Ai-1  =  [X^n^I-1 

Compute  [Aj  +  u|  £j]-1  using  Lesua  9.1 

STEP  5  «2  *  2g  =  %*  A2_1  =  *A1  +  El  2lJ-1 

Compute  [A2  +  22  22^”^  Lemma  9.1 

STEP  6  23  =  (2,,^)  Sj,.  Vg  =  1N.  A3-1  *=  [A2  +  22  X21_1 

Compute  [A3  +23  23 l-1  =  [^N+l^N+l^N+l^N+l^-1  usinB 
Lemma  9.1 

STEP  7  c  =  [Xn+Jyn+iYn+*Xn+i1-1^1 

aj(+1  *  c(l)_1£  where  c(l)  it  the  firat  component  of  c 

STEP  8  Let  N  =  N+l,  GO  TO  STEP  2 


TABLE  10.2:  Adaptive  Algorithm  for  Computing  ajj+i 
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XII.  Conclusions 

A  philosophy  directed  towards  the  rational  node ling  of  wide-sense 
stationary  time  series  has  been  presented.  It  is  explicitly  based  upon  the 
Yule-Walker  equations  which  characterize  the  autocorrelation  sequence 
associated  with  the  rational  tisie  series  being  modeled.  In  particular,  the 
key  concept  is  that  of  using  an  overdetermined  set  of  Tule-Walker  equation 
evaluations  for  estimating  the  parameters  of  a  postulated  rational  model. 
This  approach  has  been  found  to  reduce  the  data  induced  hypersensitivity  of 
the  parameter  estimates  in  comparison  to  many  of  the  more  popular  parametric 
approaches  which  invoke  a  minimal  set  of  evaluations  for  obtaining  the 
parameter  estimates.  These  latter  methods  include  the  Burg  algorithm,  many 
LMS  methods,  and  the  one-step  predictor.  Comparative  examples  illustrating 
this  reduced  hypersensitivity  have  been  given  in  which  the  modeling  is  based 
on  both  exact  autocorrelation  lag  information,  and,  raw  time  series 
observations. 

The  method  of  singular  value  decomposition  was  next  introduced  and  was 
used  to  obtain  an  effective  rational  model  order  determination  procedure  as 
well  as  providing  a  novel  rational  modeling  procedure  whose  performance  has 
been  empirically  found  to  often  exceed  that  of  existing  techniques.  Studies 
are  currently  under  way  to  more  effectively  use  this  SVD  adaption  for 
achieving  yet  further  performance  improvements. 
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